1.0 Center for Advanced Sensors


1.1  Administrative 

This  report  covers  the  activities  in  the  Center  for  Year  Two  (FY06).  Steven  Murrill,  the 
Cooperative  Agreement  Manager,  from  the  Army  Research  Laboratory  (ARL),  Adelphi, 
MD, met frequently with the Director of the Center to discuss Year Two activities. A
no-cost extension for Year Two extended the period of performance through 30 November
2007, primarily to accommodate research at Vanderbilt University. Vanderbilt did not
receive a subcontract from Year Three funds and therefore requested extra time to bring
its Year Two efforts to a conclusion.


1.2  Center  Activities 

The  principal  researchers  at  The  University  of  Memphis  report  their  research  activities  in 
Sections  2  through  6  of  this  report.  Vanderbilt  University  researchers  report  their 
activities  in  Section  7  of  this  report. 

The  Center  for  Advanced  Sensors  continues  to  support  NATO  activities  in  the  area  of 
sensor  performance  modeling  for  sensors  incorporating  image  processing  enhancements. 
The Director attended the initial meeting of the Exploratory Team for Super-resolution
and Advanced Signal Processing for EO/IR Sensor Performance (ET51). The
Exploratory  Team  is  under  the  Sensors  and  Electronics  Technology  Panel  for  the 
Research  and  Technology  Agency  for  NATO.  Current  activity  is  modeling  multiband 
sensor  performance  with  image  processing  enhancements. 


1.2.1  Providing  Support  to  Research  Efforts 

Center  personnel  provided  support  to  research  efforts  at  ARL  in  the  area  of 
Intelligence/Surveillance/Reconnaissance  (ISR)  sensors.  This  activity  continues 
principally  through  the  research  of  Drs.  Robinson  and  Perry.  Dr.  Russomanno’s  research 
in  sensor  networks  is  also  in  conjunction  with  ARL  research. 

The Army’s Night Vision and Electronic Sensors Directorate (NVESD) was supported
through frequent contact with Drs. Halford and Robinson. Support was primarily for the
Modeling and Simulation Division’s research activities.

Dr.  Halford  continued  to  support  Dr.  Tim  Edwards  (under  separate  funding)  from  the 
Army’s  Redstone  Technical  Test  Center  (RTTC).  One  of  the  papers  presented  at  SPIE 
Orlando  reported  human  perception  measurements  at  the  Center.  The  effort  established 
the  relationship  between  the  number  of  cues  and  observer  performance  in  identifying 
combat  vehicles. 




1.2.2  Perception  Laboratory 

The  Center’s  Perception  Laboratory  continues  to  support  military  research.  NVESD 
perception  experiments  throughout  the  period  of  performance  were  conducted  at  The 
University  of  Memphis.  Additional  experiments  are  planned  for  the  foreseeable  future. 
The  long  wave  IR  imager  continues  to  support  Dr.  Edward  Perry’s  research.  The 
multispectral  MWIR  imager  has  acquired  images  to  support  research  on  filter  design  by 
independent  component  analysis  for  feature  extraction  in  multi-band  images. 

2.0 Ontology-Based Sensor Network Environment

This  section  summarizes  the  Year  Two  activities  related  to  an  ontology-based  sensor 
network  environment  supported  from  FY2006  Center  for  Advanced  Sensors  funds.  Dr. 
David  Russomanno,  along  with  graduate  students  Mr.  Caleb  Goodwin,  Mr.  Joseph 
Qualls,  and  Mr.  Andy  Lemmon,  and  undergraduate  student  Mr.  Timothy  Henley 
contributed  to  Year  Two  activities.  Additional  details  on  the  activities  can  be  obtained 
from  the  quarterly  reports,  which  include  the  published  and  submitted  articles  during 
Year  Two. 

2.1  Milestone  One:  Develop  ontology-based  sensor  network 
prototype  environment 

The  primary  objective  of  milestone  one  is  to  continue  research  and  development  of  an 
ontology-based sensor network prototype environment to investigate the use of Semantic
Web  infrastructure  and  service-oriented  architecture  (SOA)  [Singh,  2005]  in  the 
deployment  of  sensors  and  wireless  sensor  networks  which  can  be  dynamically 
discovered  and  tasked. 

2.1.1  Activity 

1.  Monitored  the  activity  surrounding  the  SensorML  initiative  [OGC  04-019,  2004]  and 
its  applicability  to  the  prototype.  Reviewed  the  revised  OGC  SensorML  specification 
[OGC  05-086r2,  2006]  and  its  impact  on  the  design  of  OntoSensor  [Russomanno,  2005; 
Goodwin,  2006].  Determined  that  OntoSensor  will  require  significant  revision  of  its 
logical  data  model  to  be  utilized  by  those  researchers  that  adopt  the  SensorML 
specification. 

2. Developed a test interface for the Crossbow MTS310 wireless sensors. The MTS310
has bi-axial accelerometer, bi-axial magnetometer, acoustic, temperature, and
photosensitive light sensing elements [CrossBow, 2006]. Developed a graphical user
interface (GUI) to program and test the MTS310. The GUI facilitates simple programming
of the MTS310 sensors, including monitoring transmit/receive communications and
setting temperature alarm thresholds for a given client identification, among other
functionality. The purpose of the application is to serve as a first step toward
implementing a services-based architecture for a wide variety of wireless sensors to
broaden the prototype’s capabilities.

3. Reviewed several research approaches to support semantic service description and
matchmaking with registries that use an existing Universal Description, Discovery, and
Integration (UDDI) specification for applicability to the prototype. Approaches reviewed
included one proposed by the Naval Research Laboratory (NRL) [Luo, 2005]. Although
the NRL effort does not specifically include sensors and sensor services, it outlined an
approach to bulk-load semantic data into UDDI tModels before queries are issued.
tModels provide a mechanism to extend UDDI repositories with external information.
We elected to implement a modified approach for the prototype in which the entire
sensor ontology is not bulk-loaded into tModels; rather, only the portion of the ontology
that was used to satisfy a previous query for sensor types or capabilities is loaded. When
subsequent queries concerning sensor capabilities are issued to the prototype, the
tModels serve as a type of cache for satisfying the query.

4.  Continued  to  refine  the  sensor  semantic  service  description  and  match  approach  for  the 
prototype  environment.  Activity  focused  on  developing  an  approach  to  associate  a 
relevant  subset  of  OntoSensor  with  a  UDDI  registry  of  sensor  services  to  enable  semantic 
searching  for  specifications  of  sensor  capabilities.  There  is  a  need  to  augment  the  specific 
facts  in  the  UDDI  repository  with  relevant  ontological  schema  information  to  more 
effectively  solve  subsequent  queries  to  locate  sensor  services  that  require  semantic 
matchmaking.  C.  Goodwin  completed  his  M.S.  thesis  entitled:  “Ontology  Integration 
within  a  Service-Oriented  Architecture  for  Sensor  Networks.”  His  work  advanced  the 
prototype  environment  in  the  laboratory  in  that  sensor  services  can  be  listed  in  a  registry 
that  references  a  machine-interpretable  ontology.  The  registry  conforms  to  the  UDDI 
specification,  but  it  is  augmented  with  semantic  matching  via  the  OntoSensor  ontology. 
Although  the  approach  has  several  practical  limitations,  it  is  illustrative  of  how  Semantic 
Web  and  SOA  infrastructure  can  be  used  in  sensor  network  deployments. 

5.  Continued  to  investigate  Microsoft  Research  published  efforts  related  to  sensor 
services  [Liu,  2005a;  Liu,  2006b;  Woo,  2006]  for  applicability  to  the  prototype. 

Microsoft  Research  developed  a  Networked  Embedded  Sensing  Toolkit  (MSR  Sense) 
which  is  “a  collection  of  software  tools  that  allow  users  to  collect,  process,  archive,  and 
visualize  data  from  a  sensor  network.”  It  is  unclear  at  this  time  if  MSR  Sense  fits  within 
the  conceptual  framework  of  our  work  and  can  be  leveraged  within  our  prototype 
environment. 

6.  Investigated  the  concept  of  semantic  streams  for  interpretation  of  sensor  data 
[Whitehouse,  2006]  using  a  logic-based  framework.  The  approach  may  benefit  from  an 
ontology  such  as  OntoSensor  and  a  methodology  for  sensor  service  descriptions. 

7. Evaluated the use of the Altova SemanticWorks product for use with the OntoSensor
ontology.

8.  Wrote  three  papers  related  to  Year  Two  activities  of  which  the  first  was  published,  the 
second  was  accepted  and  is  currently  in  press,  and  the  third  is  in  peer  review: 

C.  Goodwin,  D.J.  Russomanno  and  J.  Qualls  (2007)  “Survey  of  Semantic  Extensions  to 
UDDI:  Implications  for  Sensor  Services,”  Proceedings  of  the  2007  International 
Conference  on  Semantic  Web  and  Web  Services,  CSREA  Press,  Las  Vegas,  Nevada,  pp. 
16-22. 

D.J. Russomanno and J.C. Goodwin (accepted, in press) “OntoSensor: An ontology for
sensor network application development, deployment, and management,” In G. Aggelou
(ed.), Handbook of Wireless Mesh & Sensor Networking, McGraw-Hill International,
New York, NY.

C. Goodwin and D.J. Russomanno (in peer review) “Ontology Integration within a
Service-Oriented Architecture for Expert System Applications using Sensor Networks,”
Expert Systems.
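
The query-driven caching described in activity 3 above can be illustrated with a minimal sketch. It assumes a toy in-memory subclass hierarchy in place of OntoSensor and a plain dictionary in place of UDDI tModels; the names `ONTOLOGY`, `TModelCache`, and `find_sensor_types` are hypothetical and are not part of UDDI, OntoSensor, or the prototype. The sketch only shows the idea that an ontology fragment is stored after a query has needed it, rather than being bulk-loaded in advance.

```python
# Toy subclass hierarchy standing in for a sensor ontology (hypothetical data).
ONTOLOGY = {
    "Sensor": ["Accelerometer", "Magnetometer", "ThermalSensor"],
    "ThermalSensor": ["Thermistor", "IRCamera"],
    "Accelerometer": [],
    "Magnetometer": [],
    "Thermistor": [],
    "IRCamera": [],
}


class TModelCache:
    """Stand-in for registry tModels that hold ontology fragments."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.fragments = {}      # concept -> set of matching concepts
        self.ontology_reads = 0  # number of traversals of the full ontology

    def _descendants(self, concept):
        """Walk the ontology to collect a concept and all of its subclasses."""
        self.ontology_reads += 1
        found, stack = set(), [concept]
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self.ontology.get(current, []))
        return found

    def find_sensor_types(self, concept):
        """Answer from the cache when possible; otherwise traverse, then cache."""
        if concept not in self.fragments:
            self.fragments[concept] = self._descendants(concept)
        return self.fragments[concept]


cache = TModelCache(ONTOLOGY)
first = cache.find_sensor_types("ThermalSensor")   # cache miss: walks the ontology
second = cache.find_sensor_types("ThermalSensor")  # cache hit: no traversal
print(sorted(first), cache.ontology_reads)
```

Only the fragment for the queried concept is stored, so concepts that no query has touched never occupy the cache. The cost of this choice is that the first query for each concept still pays for a full traversal.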

2.2 Milestone Two: Modify/adapt existing sensor ontology to
account for target acquisition (TA) sensors as well as
Intelligence, Surveillance & Reconnaissance (ISR) sensors

This  effort  required  investigating  the  use  of  Semantic  Web  compliant  interfaces  for  IR 
cameras,  TA  and  ISR  sensors  (or  their  surrogates)  to  join  the  prototype  sensor  network 
(that  is,  sensors  that  plug  and  play  using  Semantic  Web  infrastructure). 

2.2.1  Activity 

1.  Pursued  integration  of  a  SONY  XCD-SX910  camera  within  the  prototype 
environment.  Students  configured  an  XCD-SX910  camera  using  National  Instruments 
(NI)  Vision  Acquisition  Software  to  acquire,  display,  and  save  images. 

2. Investigated creating a SOA interface for a Sony XCD-SX910 camera. A new API was
released for this camera, and J. Qualls reviewed the API for possible integration into the
ontology-based sensor network prototype environment. Several executables were created
with calling conventions to enable invocation of the camera as a Web service.

3.0 Performance Modeling of Sensors with Image Processing
Enhancements (Dr. Carl Halford)

Contacts:  Dr.  Ronald  Driggers,  Dr.  Gary  Wood 
ARL BAA Paragraph Reference: 2.6, 2.9

Papers  Published: 

S. Chari, C.E. Halford, A.L. Robinson and E.L. Jacobs, "Multispectral infrared image
classification using filters derived from independent component analysis," Optical
Engineering, vol. 46, 116401, 2007.

R.L. Espinola, E.L. Jacobs, C.E. Halford, D.H. Tofsted and R. Vollmerhausen,
"Modeling the target acquisition performance of active imaging systems," Optics
Express, vol. 15, March, 2007.

C.E. Halford, A.L. Robinson, E.L. Jacobs and R.G. Driggers, "Tilted surfaces in SWIR
imagery: speckle simulation and a simple contrast model," Optical Engineering, vol. 46,
May, 2007.

R. Driggers, J. Reynolds, J. Visgaitis, S. Burks, J. Franck, C. Howell, C. Halford, N. Gat,
K. Krapels, "Performance benefits of dual f-number infrared systems," National MSS
Symposium, Las Vegas, NV, November, 2007.

V.A. Hodgkin, C.E. Halford and T. Maurer, "The impact of atmospheric path radiance on
MWIR and LWIR sensor performance," SPIE Conference Proceedings, vol. 6543, April,
2007.

M.A. Brickell, T.C. Edwards, C.E. Halford and K.M. Dennen, "Correlation between
human observer performance and the number of spatial, thermal, and total cues in LWIR
imagery," SPIE Conference Proceedings, vol. 6543, April, 2007.

R.L. Espinola, E.L. Jacobs and C.E. Halford, "Active imaging system performance model
for target acquisition," SPIE Conference Proceedings, vol. 6543, April, 2007.

S.M. Salem, R.K. Moore, P. Bijl, M.A. Hogervorst and C.E. Halford, "The effects of
spatial band-limited noise on human performance for tank identification," SPIE
Conference Proceedings, vol. 6543, April, 2007.

C.L. Howell, R.K. Moore, S.D. Burks and C.E. Halford, "An evaluation of fusion
algorithms using image metrics and human identification performance," SPIE
Conference Proceedings, vol. 6543, April, 2007.

S. Chari, C.E. Halford and E.L. Jacobs, "Multispectral infrared image classification using
filters derived from independent component analysis," SPIE Conference Proceedings,
vol. 6576, April, 2007.

R.K. Moore, E.L. Jacobs and C.E. Halford, "Cell-balancing for vehicle identification
perception experiments and correcting for cell imbalance in test results," SPIE
Conference Proceedings, vol. 6543, April, 2007.

R.K. Moore, C.E. Halford and C.L. Howell, "Finding a fusion metric that best reflects
human observer preference," SPIE Conference Proceedings, vol. 6543, April, 2007.

J. Fanning, J. Miller, J. Park, G. Tener, J. Reynolds, P. O'Shea, C.E. Halford and R.G.
Driggers, "IR system field performance with superresolution," SPIE Conference
Proceedings, vol. 6543, April, 2007.

3.1  Milestone  One  (2007):  Develop  Image  Quality  Metrics  for 
Fused  Imagery 

Perception tests were conducted using fused visible and MWIR imagery to analyze
human performance in identifying the eight-target set developed by the Night Vision
and Electronic Sensors Directorate (NVESD). We are determining whether any of the
metrics under investigation show significant correlation with human performance. For
modeling purposes, we also wish to know how much improvement in ID performance
each fusion algorithm yields for thermal and visible fusion of tank vehicles. An initial
analysis of the data from the perception testing has been carried out, and the preliminary
results are shown below in Figure 3.1.1. Examples of the images used in the perception
test are given in Appendix B.


Figure  3.1.1:  Initial  results  from  the  tank  ID  perception  test. 

During the perception test, each observer was shown either an image of both source
bands side by side (“brain fusion”), an image of the individual source bands separately, or
a fused image using the source bands as inputs. The observers were tasked to correctly
identify the tank in the displayed image. The initial results suggest that, contrary to
expectation, displaying the two source bands side by side does not yield the best
performance. Surprisingly, the color averaging and opponent processing fusion methods
outperformed the “brain fusion” technique. This is in direct opposition to our original
hypothesis that “brain fusion” would offer the best possible performance. Further testing
is needed to investigate this matter. The current test results conflict with our previous
fusion tank ID pilot experiment. Neither of these pilot experiments had enough observers
to yield curves with non-overlapping error bars.

The above result suggests that a gain in performance can be realized by using fusion.
Ideally, a good fusion process should offer more salient information in the fused image
than any of the source images alone. For example, Figures 3.1.2.1 and 3.1.2.2 show cases
where fusion helps with the identification of a vehicle. False coloring is used to assist
with identifying those signatures specific to each spectral band.


Figure 3.1.2.1: Rear aspect; Top left: Visible band. Top right: MWIR band. Bottom: Fused


Figure 3.1.2.2: Side aspect; Top left: Visible band. Top right: MWIR band. Bottom: Fused




Figure 3.1.2.1 shows a rear aspect of the target and Figure 3.1.2.2 shows a side aspect.
In each figure, the top left image is the visible band, the top right image is the MWIR
band, and the bottom image is the fused result. In Figure 3.1.2.1, the MWIR image
shows a clearly identifiable thermal signature while no signature stands out in the visible
image. In Figure 3.1.2.2, the opposite is true: the side aspect reveals easily identifiable
traits in the visible image, but the MWIR signatures cannot readily be associated with any
one particular target. The respective fused images include the salient information from
each band, so the fused image alone gives a better indication of the type of target than
either band used independently.

The results offered in this report are preliminary, and more observers are scheduled to
take the perception test August 20-24, 2007. We expect that the increase in observers will
allow us to determine whether learning effects are influencing the test results or whether
the fused image contains new information that is not present in either of the individual
source bands.

3.1.1 Fusion introduction

Images  of  different  spectral  bands  can  be  combined  in  various  ways  to  produce  single 
images  containing  information  from  all  of  the  source  bands.  Many  image  fusion 
algorithms have been devised and are found in the open literature. At best, image fusion can
produce  better  observer  task  performance  than  is  possible  using  either  of  the  source 
imagery  bands.  At  worst,  fused  imagery  can  yield  observer  performance  that  is 
significantly  lower  than  performance  achieved  using  the  source  imagery.  Fusion 
algorithms  seem  to  differ  in  the  degrees  to  which  they  successfully  handle  source 
imagery  with  noise,  mis-registration,  differing  spatial  resolution,  and  opposing  contrast. 
The  algorithms  also  have  widely  varying  memory  requirements  and  computational 
complexity.  In  an  attempt  to  understand  how  several  different  image  fusion  algorithms 
affect  the  ability  of  observers  to  perform  visual  tasks,  several  perception  tests  were 
devised  and  performed  at  the  Center  for  Advanced  Sensors  at  the  University  of  Memphis. 

3.1.2  Fusion  preference  paired  comparison 

First,  a  paired  comparison  test  was  performed  to  see  if  observers  prefer  images  from  a 
particular  fusion  algorithm  over  the  source  bands  or  other  fused  images.  An  attempt  was 
also  made  to  find  a  fusion  quality  metric  that  reflects  the  perceived  quality  level  of  fused 
imagery.  None  of  the  fusion  quality  metrics  that  we  investigated  exhibited  such  a 
correlation with observer preference. The results were presented at the 2006 SPIE
Defense and Security Symposium.

3.1.2.1 Results

Perception tests were conducted to compare the fusion metrics [1] with human observer
preference. Five observers participated in two-alternative forced-choice (2AFC)
experiments. For each pair of images, observers were asked to indicate which image had
higher quality. Two perception experiments were administered, one with the ‘Lights’
images and the other with the ‘Bus’ images. The observers’ preference results were
calculated, and plots of the perceived image quality are given below.

Figure 3-1. Preference Plots

In both cases, the Gabor-fused images fared poorly. The fusion technique that we chose
with  Gabor  filters  introduced  a  noticeable  blur  that  detracted  from  the  perceived  image 
quality.  Pixel  averaging,  PCA  averaging,  and  the  SIDWT  all  yielded  consistently  high 
preferences  in  both  the  ‘Bus’  and  ‘Lights’  image  sets.  The  relative  preference  of  the 
DWT-fused  images  changed  drastically  from  the  ‘Lights’  images  to  the  ‘Bus’  images. 
This  may  be  because  the  DWT-fused  ‘Lights’  image  exhibited  obvious  distortion  artifacts 
due  to  fusion,  while  the  DWT-fused  ‘Bus’  image  did  not.  This  dropped  the  preference 
for  the  DWT  in  the  ‘Lights’  images,  while  allowing  the  DWT  to  still  perform  well  in  the 
‘Bus’  imagery  test.  This  change  in  preference  due  to  fusion-introduced  distortion  proved 
important  in  determining  which  fusion  metric  best  represents  observer  preference. 

It  was  also  noted  that  in  the  ‘Lights’  test,  the  MWIR  image  was  more  preferred  than  any 
of  the  fused  imagery.  Upon  reviewing  the  LWIR  and  MWIR  images,  it  was  noted  that 
more  information  appears  to  be  present  in  the  MWIR  image  than  in  the  LWIR  image  of 
the  ‘Lights’  scene.  Furthermore,  the  LWIR  ‘Lights’  image  does  not  seem  to  contain  any 
information  that  is  not  also  present  in  the  MWIR  image,  so  the  ‘Lights’  images  were  not 
an ideal image set for image fusion. A more appropriate image set would yield fused
images  that  would  be  more  preferred  than  either  of  the  input  images.  This  brings  up  an 
interesting  issue  in  image  fusion:  when  is  fusion  likely  to  be  beneficial?  When  multiple 
spectral  bands  are  available,  one  band  may  present  a  clear  advantage  over  any  other.  In 
such  a  case,  fusion  will  not  be  ideal.  In  other  cases,  two  or  more  bands  may  contain 
unique  information  that  could  potentially  be  fused  into  a  single  image.  It  is  likely, 
though, that an observer using a multispectral sensor will not know which of the
available  bands  is  most  useful  at  an  instant  and  may  also  rely  on  a  fusion  algorithm  to 
‘sift  through’  the  available  bands  for  all  available  information,  even  if  it  is  all  to  be  found 
in  a  single  band.  Here,  the  fusion  algorithm  needs  to  be  transparent  enough  to  not 
introduce  perceived  degradation  to  the  imagery.  So,  the  case  of  image  fusion  with  only 
one  useful  band  is  still  worth  investigation,  since  it  may  occur  in  practice. 

The preference results were correlated with the fusion quality metrics listed in [1] to
determine which fusion metric best correlates with the observers’ preference. The
correlation coefficients for the ‘Lights’ and ‘Bus’ images are listed in the following table.

Table 3. Correlation coefficients relating the fusion metrics to the observer preferences

                                          FQI      WFQI     EDFQI    MINF     MS(F-L)  MS(F-M)
Correlation to preference for ‘Lights’    0.3604   0.1476   -0.0374  0.7193   0.3207   0.3810
Correlation to preference for ‘Bus’       0.6950   0.9825   0.9488   0.8231   -0.2967  0.5969

Both of the Mannos-Sakrison filter metrics yielded results that correlated poorly with
observer preference. The FQI also did not correlate particularly well with the measured
observer preferences for the ‘Lights’ or ‘Bus’ images. While none of the metrics
performed especially well for the ‘Lights’ images, the mutual information metric (MINF)
correlated best with the observers’ preference. The mutual information metric also
correlated best overall when both the ‘Lights’ and ‘Bus’ results are considered. Both the
WFQI and the EDFQI correlated extremely well with the ‘Bus’ preferences, but both also
correlated quite poorly with the ‘Lights’ preferences. This may be because of the fusion
distortion present in the DWT-fused ‘Lights’ image. Both of these metrics rated the
DWT images highly for both the ‘Bus’ and ‘Lights’ images. Neither detected, though,
the distortion in the DWT-fused ‘Lights’ image that caused its perceived quality to drop
considerably.
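
As a minimal sketch of the comparison above, the snippet below defines a Pearson correlation coefficient and then ranks the metrics using the coefficients transcribed from Table 3. Two things are assumptions rather than statements from the report: that the reported coefficients are Pearson coefficients, and that "best overall" can be judged by the worst-case or the mean coefficient across the two image sets. The short `metric_scores` and `preference` lists are made-up values used only to exercise the function.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up metric scores and preference values, only to exercise pearson().
metric_scores = [0.2, 0.4, 0.6, 0.8]
preference = [1.0, 2.0, 3.0, 4.0]
example_r = pearson(metric_scores, preference)

# Coefficients transcribed from Table 3: (Lights, Bus) for each metric.
TABLE_3 = {
    "FQI": (0.3604, 0.6950),
    "WFQI": (0.1476, 0.9825),
    "EDFQI": (-0.0374, 0.9488),
    "MINF": (0.7193, 0.8231),
    "MS(F-L)": (0.3207, -0.2967),
    "MS(F-M)": (0.3810, 0.5969),
}

# Assumption: "overall" is judged by the worst case and by the mean across
# the two image sets; the report does not state which criterion was used.
worst_case = {name: min(pair) for name, pair in TABLE_3.items()}
mean_value = {name: sum(pair) / 2 for name, pair in TABLE_3.items()}
best_by_worst_case = max(worst_case, key=worst_case.get)
best_by_mean = max(mean_value, key=mean_value.get)
print(best_by_worst_case, best_by_mean)
```

Under either criterion the mutual information metric (MINF) ranks first, which agrees with the observation in the text.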

The  following  scatter  plots  compare  the  different  fusion  metrics  with  the  measured 
perceived  image  quality  of  the  fused  images.  The  correlation,  or  lack  thereof,  can  be  seen 
for  each  fusion  metric. 

Figure 3-2. Scatter plots of the metrics and the measured preferences for the ‘Bus’ images.

It was noted previously that, for the ‘Bus’ images, the WFQI and EDFQI both
demonstrated good correlation with the observers’ preference. This is also apparent in the
scatter plots. Both the WFQI and the EDFQI show strong linear correlations with the
perceived image quality. The mutual information plot also shows some correlation with
the perceived image quality. In Figure 3-3, the scatter plots are shown for the ‘Lights’
images. Visually, the correlations are quite poor for almost all of the metrics. The mutual
information still exhibits some correlation with the perceived image quality, as was
shown in the correlation coefficients of Table 3. Of the fusion metrics tested here, mutual
information was found to be the best indicator of perceived image quality for
MWIR/LWIR fused images. Among the fusion methods used in the paired comparison
test, the SIDWT was consistently preferred above all other fusion methods. The
pixel-averaged images also yielded a consistently high perceived image quality with a
very small processing time.

Figure 3-3. Scatter plots of the metrics and the measured preferences for the ‘Lights’
images

3.1.2.2 Fusion preference conclusions

While  the  mutual  information  most  consistently  correlated  with  the  measured  observer 
preferences,  none  of  the  fusion  metrics  tested  here  seemed  to  measure  fused  image 
quality  in  a  way  similar  to  the  human  observers.  It  is  noted  that  observers  consistently 
deemed  the  pixel-averaged  and  SIDWT  fused  images  to  be  of  high  quality  relative  to  the 
other images in the set. Future work will include task-performance-based perception
experiments to find an image metric that correlates well with an observer’s ability to
perform a visual task. Knowing an image fusion metric that relates well to observer
performance will help NVESD to model the effects of image fusion on target acquisition.

3.1.3  Number  identification  performance  with  fused  imagery 

A second perception test was run to determine whether an improvement in performance
due to fusion of source imagery could be measured. Two synthetic source bands were
created by splitting up images of numbers. The source bands were then fused in several
different ways and presented to observers for identification in a perception experiment. A
significant improvement in observer performance was measured for several of the tested
fusion algorithms.

For image fusion to be useful, each source band must contain some information not
present in the other band. Also, some of this unique information must be useful to the
observer who is performing some assigned task. If these conditions are not met, fusion
will not increase the observer’s ability to perform the task. The task to be measured in
this test was number identification. Portions of number images in two source band
images were fused together using several techniques before being presented to the
observers for identification.

The five fusion methods tested were a discrete wavelet transform method, pixel
averaging, a Gabor filtering method, a principal component analysis method, and a
shift-invariant wavelet transform method. Most of the methods decompose the source
images with some set of basis functions and then compare coefficients to determine
information content in each band of each source image. This information is then used to
pick layers for the construction of the fused image. Observers were asked to identify
images of numbers 0 through 9 in a 10-alternative forced-choice experiment. The source
band images containing the incomplete images of the numbers were also displayed to the
test subjects for identification. Six different font sizes were tested to allow performance
curves to be generated for each fusion method. The numbers displayed were selected at
random according to a uniform probability distribution.
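
Of the five methods, pixel averaging is simple enough to sketch directly. The example below assumes images stored as nested lists of 8-bit gray levels and assumes that pixels missing from a source band are shown at the background gray level; the function name `fuse_pixel_average` and the tiny 2x3 bands are hypothetical and only illustrate the operation. The transform-based methods are not shown.

```python
def fuse_pixel_average(band_a, band_b):
    """Fuse two equally sized grayscale images by averaging each pixel pair."""
    if len(band_a) != len(band_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(band_a, band_b)
    ):
        raise ValueError("source bands must have the same dimensions")
    return [
        [(a + b) / 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


# Illustrative 2x3 source bands: each band holds part of a white (255) stroke
# on the gray (127) background; the values are made up for this example.
band_a = [[255, 127, 127], [255, 127, 127]]
band_b = [[127, 127, 255], [127, 127, 255]]
fused = fuse_pixel_average(band_a, band_b)
print(fused)
```

In this toy case the fused image contains the strokes from both bands, but each stroke appears at reduced contrast (191 rather than 255) because it is averaged with the background of the other band.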

In  the  analysis,  the  mean  probability  of  identification  as  a  function  of  font  size  will  be 
plotted  to  see  which  fusion  method  yielded  the  best  observer  performance  on  average. 
Confidence  intervals  will  be  used  to  help  determine  if  the  measured  differences  are 
significant  or  not.  An  investigation  will  be  made  into  the  nature  of  the  incorrect 
responses  from  the  observers  as  well.  If  an  observer  cannot  correctly  perform  the  task, 
will  their  incorrect  answers  be  spread  uniformly  from  0  to  9?  Or  will  their  answers  be 
clustered around visually similar numbers, indicating that some useful information was
presented  to  the  observer? 
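
The analysis outlined above can be sketched as follows. The sketch assumes a normal-approximation (Wald) 95% interval for the probability of identification, since the report does not say which interval was used, and it tallies incorrect answers as (presented, response) pairs so that clustering around particular digits can be inspected. The trial data and the function names are made up for illustration.

```python
import math
from collections import Counter


def prob_id_with_ci(presented, responses, z=1.96):
    """Probability of identification with a normal-approximation interval."""
    n = len(presented)
    correct = sum(p == r for p, r in zip(presented, responses))
    p_hat = correct / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def error_counts(presented, responses):
    """Count each (presented, response) pair among the incorrect answers."""
    return Counter((p, r) for p, r in zip(presented, responses) if p != r)


# Made-up trial data for one fusion method at one font size.
presented = [3, 8, 1, 7, 3, 8, 0, 5, 6, 9]
responses = [3, 3, 1, 1, 8, 8, 0, 5, 6, 9]
p_hat, low, high = prob_id_with_ci(presented, responses)
errors = error_counts(presented, responses)
CHANCE = 1 / 10  # guessing rate in a 10-alternative forced-choice task
print(p_hat, errors.most_common())
```

If incorrect answers were pure guesses, the error pairs would be spread roughly evenly over the nine wrong digits. Repeated pairs such as 3 and 8 being confused with each other would instead suggest that partial shape information reached the observer.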

3.1.3.1 Image Preparation

MATLAB  was  used  to  generate  eye  chart  images  of  numbers  at  six  different  font  sizes. 
The  numbers  were  white  on  a  gray  background  (127  grayscale  value  on  an  8-bit  scale). 
The  clean  number  images  were  split  into  two  source  band  images  randomly  using  the 
process  shown  below: 

Figure 3-4. Image processing to generate two source bands to fuse


White noise was effectively band-pass filtered by using a single layer of its Laplacian
pyramid decomposition. A threshold was then applied to the resulting correlated noise to
make an extraction mask to apply to the image of numbers. A complementary mask was
then generated by subtracting the first mask from a matrix of ones. The center frequency
of the noise was chosen to make identification difficult with a single source band image.
Noise with higher or lower frequency content allows the number images to be more
easily identified. The source bands generated by the noise mask process were then used
as inputs to the image fusion algorithms. Examples of the stimulus images presented to
the test subjects are shown below:

Figure 3-5, Fused stimulus images used in the perception test
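
The mask-splitting procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the MATLAB code used in the study: a difference of two box blurs stands in for a single Laplacian pyramid layer, the threshold is taken as zero, and the function names, blur radii, and test image are hypothetical.

```python
import random

def box_blur(img, radius):
    """Box-blur a 2-D list, averaging only the pixels that fall inside the image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total, count = 0.0, 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        total += img[rr][cc]
                        count += 1
            out[r][c] = total / count
    return out

def split_into_bands(image, background=127, fine=1, coarse=3, seed=0):
    """Split an image into two complementary source bands using a noise mask."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    noise = [[rng.gauss(0.0, 1.0) for _ in range(w)] for _ in range(h)]
    # Band-pass the noise: difference of two low-pass versions
    # (assumed stand-in for one layer of a Laplacian pyramid).
    lo_fine, lo_coarse = box_blur(noise, fine), box_blur(noise, coarse)
    # Threshold the correlated noise at zero (assumed) to get the extraction mask.
    mask = [[1 if lo_fine[r][c] - lo_coarse[r][c] > 0.0 else 0 for c in range(w)]
            for r in range(h)]
    # Complementary mask: subtract the first mask from a matrix of ones.
    comp = [[1 - m for m in row] for row in mask]
    band_a = [[image[r][c] if mask[r][c] else background for c in range(w)]
              for r in range(h)]
    band_b = [[image[r][c] if comp[r][c] else background for c in range(w)]
              for r in range(h)]
    return mask, comp, band_a, band_b

# Hypothetical 16x16 test image: white square on the gray (127) background.
image = [[255 if 4 <= r < 12 and 4 <= c < 12 else 127 for c in range(16)]
         for r in range(16)]
mask, comp, band_a, band_b = split_into_bands(image)
```

Every pixel of the clean image appears in exactly one of the two bands, so neither band alone carries the complete number, while a fusion algorithm that combines them well can recover it.
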

3.1.3.2 Analysis of Results

Ten  observers  participated  in  the  fusion  number  identification  perception  experiment. 
Each  observer  was  shown  40  instances  of  each  image  type  (5  fusion  methods  plus  2 
source  bands)  at  each  of  the  6  font  sizes.  This  resulted  in  a  stimulus  set  of  1680  number 
images  (6x7x40)  that  were  presented  to  each  observer.  The  mean  probabilities  of 
identification are plotted against the font size in Figure 3-6.
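
As a check on the stimulus count, the per-observer trial list can be sketched as follows. The method labels are shorthand for the five fusion methods and two source bands named in this section; the dictionary layout, the seed, and the shuffling of presentation order are assumptions made for illustration.

```python
import itertools
import random

FONT_SIZES = range(1, 7)  # six font sizes
IMAGE_TYPES = ["pixel_avg", "PCA", "Gabor", "DWT", "SIDWT", "band_1", "band_2"]
REPEATS = 40              # instances of each image type at each font size

def build_trials(seed=0):
    """Build one observer's stimulus list with uniformly random digits."""
    rng = random.Random(seed)
    trials = [
        {"font": font, "type": image_type, "digit": rng.randrange(10)}
        for font, image_type, _ in itertools.product(
            FONT_SIZES, IMAGE_TYPES, range(REPEATS))
    ]
    rng.shuffle(trials)  # assumed: presentation order is randomized
    return trials

trials = build_trials()
print(len(trials))  # 6 x 7 x 40 = 1680
```
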


Figure 3-6, Probability of ID versus font size and probability averaged over all font sizes

The second graph shows the probability of identification for each method averaged over
the six font sizes. The graphs show that fusion was beneficial in this case: every fusion
technique yielded better average performance than either of the source bands alone. The
differences in performance among the pixel-averaged, DWT, and shift-invariant DWT
images are very small. These three techniques all appear to outperform both the PCA and
Gabor methods, but the precision of the test must be quantified to ensure that this
measured difference is meaningful. The results in the bar graph agree well with the
previously measured observer preferences. In general, the fusion techniques that were
more preferred by the observers also yielded better task performance in the number
identification test. Figure 3-7 shows a plot of the previously measured preferences.

Figure 3-7, Preferences measured from a previous paired comparison test (preference plot for bus images)

Because  the  identification  test  results  correlate  well  with  the  results  of  the  paired 
comparison  test,  it  is  expected  that  the  mutual  information  metric,  the  weighted  FQI,  and 
the  edge-dependent  FQI  will  give  measurements  that  correlate  well  with  the  ID  test 
results.  This  means  that  they  may  be  useful  in  modeling  the  effects  of  image  fusion  on 
observer  task  performance. 


The confidence interval helps to indicate the precision of empirical data. Below are the
formulas to find the mean, variance, and confidence interval of a set of data:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$

$$\bar{x} \pm \operatorname{erf}^{-1}(\mathrm{ConfLvl})\left(\frac{\sigma\sqrt{2}}{\sqrt{n}}\right)$$

In these equations, $\bar{x}$ is the mean, $n$ is the population size, $x_i$ is a single measurement,
ConfLvl is the desired confidence level, and $\sigma^2$ is the variance. The 95% confidence
intervals were found for this set of data. The graph of identification versus font size with
the confidence bars is shown next:

Figure 3-8, Probability of ID versus font size with 95% confidence intervals
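
A minimal sketch of the confidence interval calculation follows. It assumes the common normal-approximation form, mean ± erf⁻¹(ConfLvl)·σ√2/√n, with the variance normalized by n. The helper names and the per-observer probabilities are hypothetical illustration values, not measured data.

```python
import math
from statistics import NormalDist

def inverse_erf(p):
    """Inverse error function via the standard normal quantile function.

    erf(z) = 2*Phi(z*sqrt(2)) - 1, so erf^-1(p) = Phi^-1((1 + p)/2) / sqrt(2).
    """
    return NormalDist().inv_cdf((1.0 + p) / 2.0) / math.sqrt(2.0)

def confidence_interval(samples, conf_level=0.95):
    """Return (mean, lower, upper) for the given confidence level."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n  # normalized by n (assumed)
    half_width = inverse_erf(conf_level) * math.sqrt(2.0 * variance / n)
    return mean, mean - half_width, mean + half_width

# Hypothetical probabilities of identification from ten observers.
p_id = [0.80, 0.75, 0.85, 0.70, 0.90, 0.78, 0.82, 0.76, 0.88, 0.74]
mean, lower, upper = confidence_interval(p_id)
print(f"{mean:.3f} [{lower:.3f}, {upper:.3f}]")
```

For a 95% confidence level the multiplier erf⁻¹(0.95)·√2 is approximately 1.96, the familiar two-sided normal value.
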

The graph shows that this test did not measure a significant difference in performance
among the pixel averaging, DWT, and shift-invariant DWT techniques. The confidence
intervals for these three methods have a large overlap at almost every font size. The PCA
and Gabor confidence bars also overlap the other techniques at font sizes 4, 5, and 6. As
the font size decreased and the task became more difficult, however, the confidence bars
for the PCA and Gabor techniques no longer overlapped the bars of the other three
techniques.

The  probabilities  of  identification  in  this  experiment  give  some  indication  of  how  much 
information  was  present  for  each  combination  of  fusion  methods  and  font  sizes.  Another 
question  remains:  if  an  observer  does  not  correctly  identify  the  stimulus,  does  this  mean 
that  no  information  was  available  to  them  or  simply  that  not  enough  information  was 
available?  The  response  probability  density  functions  may  give  us  some  indication  of 
this.  If  no  information  is  available  from  the  stimulus,  we  would  expect  the  guesses  from 
the  observers  to  either  have  a  uniform  pdf  or  some  pdf  that  does  not  depend  on  the 
stimulus  presented.  The  confusability  matrices  below  use  grayscale  values  to  indicate  the 
measured  probability  of  an  observer  response  in  a  column  when  presented  with  a 
stimulus in a row. For an easy font size, the observers correctly identify the stimuli
with very few errors. When the numbers are presented at the medium font size, some
stimuli become confusable. Looking along row 6 shows that observers were likely to
respond that the presented 6 was a 5, 8, or 3. The pdf indicated along row 7 is very
different, though. Some observers may have confused the 7 with a 2 or 4, but there were
no 5, 8, or 3 responses when a 7 was displayed at the medium font size. It seems that the
response pdfs depend on the stimulus even if an incorrect response is made.

Figure 3-9, Confusability matrices for different font sizes
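
A confusability matrix of this kind can be computed from the recorded (stimulus, response) pairs, as in the sketch below. The function name and the trial data are hypothetical and chosen only to mirror the pattern described for rows 6 and 7; they are not the measured responses.

```python
from collections import Counter

def confusability_matrix(trials, n_classes=10):
    """Row-normalized response probabilities: rows are stimuli, columns are responses."""
    counts = Counter(trials)  # counts of (stimulus, response) pairs
    matrix = []
    for stimulus in range(n_classes):
        row_total = sum(counts[(stimulus, resp)] for resp in range(n_classes))
        matrix.append([
            counts[(stimulus, resp)] / row_total if row_total else 0.0
            for resp in range(n_classes)
        ])
    return matrix

# Hypothetical responses: ten presentations each of a 6 and a 7.
trials = ([(6, 6)] * 5 + [(6, 8)] * 3 + [(6, 5)] + [(6, 3)]
          + [(7, 7)] * 8 + [(7, 2)] + [(7, 4)])
matrix = confusability_matrix(trials)
print(matrix[6][8], matrix[7][8])  # 6 is confused with 8; 7 is not
```

If the responses carried no stimulus information, every populated row would approach the same distribution. Rows that differ from one another, as rows 6 and 7 do here, indicate that even incorrect responses depend on the stimulus.
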

When  sufficient  information  is  present,  identification  can  occur  in  almost  all  trials.  As 
information  is  removed,  peaks  begin  to  form  at  incorrect  positions  of  the  response  pdf  at 
the  locations  of  confusable  stimuli.  In  effect,  the  observers  can  no  longer  discriminate  all 
of  the  stimuli,  but  they  can  discriminate  between  sets  of  confusable  stimuli.  As  even 
more  information  is  removed  these  sets  of  confusable  stimuli  increase  in  size  and 
decrease  in  number  until  pure  guessing  takes  place  amongst  all  of  the  stimuli.  The 
difficult font size shows the case in which very little information was available to the
observers.  The  guessing  was  much  more  uniform  than  in  the  medium  font  case,  resulting 
in  a  mostly  gray  confusability  matrix. 

The next graph shows the probability density functions for responses to the displayed
number 6 for each of the six font sizes. As the font size decreases, 8 quickly becomes
confusable with 6. As the font becomes smaller still, 3 and 0 also become significant
confusers.

Figure 3-10, Observer response probability density functions for the displayed number 6 at
different font sizes

The  following  chart  shows  conceptually  how  the  observer’s  task  changes  as  information 
is  removed  and  they  are  forced  to  guess  amongst  sets  of  confusable  stimuli.  It  should  be 
noted  that  the  confusabilities  displayed  in  these  results  are  specific  to  the  font  used  and 
the  group  of  fusion  techniques.  Using  different  fonts  or  a  different  means  of  information 
removal  will  result  in  different  measured  confusabilities. 

Figure 3-11, Representation of confusability layers observed in the number identification
perception test

3.1.3.3 Fusion Identification Conclusions

All of the fusion techniques tested here improved number identification performance for
the noise-split source bands. The PCA and Gabor fusion techniques yielded less observer
performance improvement than the DWT, pixel-averaging, or shift-invariant DWT
methods. Image fusion is still not sufficiently modeled for use in sensor performance
prediction. A general fusion model is desirable because there are many fusion techniques
and new ones are constantly developed. If the source sensors are well characterized,
Vfused can be found for some fusion methods by finding the effective MTF of the system
and the contrast, Ch, of the fused image. The V50 for the fused image set may be less than
the V50 terms for the source bands if complementary information is present in the two
bands. The more unique and useful information present in each band, the more observer
performance can be expected to improve. The amount of unique information in the source
bands, and its relationship to the change in task difficulty, remains to be quantified.
Understanding this relationship will make it possible to model fusion for target
acquisition.

3.1.4  Standard  image  fusion  test  set  suggestions 

In  fulfillment  of  a  request  from  NATO,  a  document  was  prepared  describing 
characteristics  of  a  source  imagery  test  set  for  image  fusion  algorithms.  It  is  meant  to  aid 
researchers  developing  fusion  algorithms  or  developers  selecting  amongst  fusion 
algorithms  for  an  application. 

A  great  number  of  image  fusion  algorithms  are  now  available  to  combine  data  from  two 
or  more  images  into  a  single  fused  image.  When  image  fusion  is  successful,  it  can 
provide  a  single  image  that  is  more  informative  to  an  observer  or  machine  than  the  source 
images.  Also,  it  may  give  context  to  data  in  different  spectral  bands  that  is  not  readily 
detected  by  viewing  the  source  images  individually.  When  image  fusion  is  not 
successful,  it  may  yield  an  output  image  of  significantly  lower  utility  than  the  source 
images. 

How,  then,  can  one  be  assured  that  useful  image  fusion  will  take  place?  Different  fusion 
algorithms  divide  and  compare  the  data  to  be  fused  in  very  different  ways.  Some 
algorithms  sacrifice  contrast  to  increase  edge  visibility  or  vice  versa.  Some  algorithms 
use  false  coloring  to  associate  information  in  the  fused  image  with  one  of  the  source  band 
images.  Other  fusion  techniques  are  more  lossy  in  the  sense  that  source  band  association 
cannot  be  easily  made  by  viewing  the  fused  image.  Different  fusion  methods  amplify  or 
attenuate  noise.  The  effects  of  mis-registration  vary  with  the  fusion  algorithm  as  well. 
The  fusion  algorithm  most  likely  to  produce  useful  results  will  depend  on  the  application 
and  the  quality  of  the  source  imagery.  This  document  will  outline  some  characteristics 
that  should  be  present  in  an  image  fusion  test  set  for  combining  data  from  electro-optical 
sensors. 

3.1.4.1 Characteristics to be tested:

A  standard  image  set  for  evaluating  image  fusion  algorithms  should  contain  examples  of 
each  of  the  following  cases.  If  perception  testing  is  not  a  concern,  it  may  be  that  such  an 
image set can be drawn from imagery that has already been collected. A new data
collection would probably be necessary to obtain an image set that both contains the
characteristics outlined here and is suitable for task-based perception testing.

3.1.4.1.1 Opposing contrast in source bands

An image fusion algorithm should be able to handle bands with opposing contrast.
Visible and infrared images of the same objects can often have opposing contrast. Some
algorithms, such as pixel-by-pixel averaging, can produce output images with much
lower target-to-background contrast than the source images. Figure 3-12 shows visible
and MWIR images of a military vehicle. Each image shows strong contrast, but the
pixel-averaged image shows very low contrast. The final image is fused using an
opponent processing technique, and it retains good contrast.


Figure 3-12, clockwise from top left: visible, MWIR, opponent processed, pixel-averaged

3.1.4.1.2 Similar source bands

Another case that must be handled gracefully by a successful fusion algorithm is
very similar source bands. It is often detrimental for a fusion algorithm to remove or
attenuate information that is readily available in both of the source bands. Example
MWIR and LWIR images are shown in Figure 3-13. The opponent processing that worked
so well for opposing contrast images eliminates much of the signal when similar source
bands are combined. The discrete wavelet transform fused image, shown at the bottom
right, contains much of the redundant and unique information from each source band.
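
The behavior of the opposing-contrast and similar-band cases can be illustrated with a small numerical sketch. The opponent processing technique used for the figures is not specified here, so a simple half-difference re-centered on mid-gray is assumed as a stand-in, and the gray levels are hypothetical.

```python
def contrast(target, background):
    """Absolute target-to-background difference in gray levels."""
    return abs(target - background)

def pixel_average(a, b):
    """Pixel-by-pixel average of two source band values."""
    return (a + b) / 2.0

def opponent(a, b, offset=127.5):
    """Assumed opponent combination: half-difference re-centered on mid-gray."""
    return (a - b) / 2.0 + offset

# Opposing contrast: bright target in band 1, dark target in band 2.
avg_opposing = contrast(pixel_average(200, 60), pixel_average(60, 200))
opp_opposing = contrast(opponent(200, 60), opponent(60, 200))

# Similar bands: bright target on a dark background in both bands.
avg_similar = contrast(pixel_average(200, 190), pixel_average(60, 70))
opp_similar = contrast(opponent(200, 190), opponent(60, 70))

print(avg_opposing, opp_opposing)  # averaging cancels the target
print(avg_similar, opp_similar)    # differencing cancels the target
```

With these values, averaging leaves no contrast for opposing bands while the difference keeps it, and the roles reverse for similar bands. This is why a test set needs both cases.

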

Figure 3-13, clockwise from top left: MWIR, LWIR, opponent fused, DWT fused to scale 4

3.1.4.1.3 Strong content in one band not present in the other

Similar  and  opposing  contrast  have  been  addressed  in  the  sections  above,  but  the 
ability  to  combine  unique  information  from  two  source  images  is  what  makes  image 
fusion  useful.  Signatures  present  in  only  one  of  the  source  bands  should  also  be  present 
and not attenuated in the fused image. The top of Figure 3-14 shows visible and MWIR
images  of  a  tank.  Clearly  present  in  the  visible  image,  but  not  seen  in  the  MWIR  image, 
is  a  solar  reflection  from  the  gun  barrel.  In  the  MWIR,  a  hot  spot  is  seen  on  the  front  of 
the  hull,  which  is  not  indicated  in  the  visible  image.  The  shift-invariant  discrete  wavelet 
fused  image  clearly  contains  both  of  these  signatures.  The  opponent  processed  image 
beside  it  also  contains  these  signatures,  but  at  considerably  lower  contrast. 


Figure  3-14,  clockwise  from  top  left:  visible  image,  MWIR  image,  shift-invariant  DWT 
fused  image,  opponent  processing  fused  image 

3.1.4.1.4 Noise in one source band

When  collecting  and  fusing  data  from  two  sensors,  it  will  often  be  the  case  that 
imagery  from  one  sensor  has  a  higher  level  of  quality  than  imagery  from  the  other  sensor. 
This  may  be  because  the  present  imaging  conditions  produce  more  signal  in  one  band 
than  in  the  other.  Also,  it  may  be  because  one  sensor  is  inherently  better  than  the  other. 
Whatever the case, it is possible that one or both source bands may contain noise.
Algorithms that compare spatial frequency bands (Laplacian pyramid, Gabor, DWT,
shift-invariant DWT, etc.) will tend to amplify this noise. This can result in a fused
image that is less useful than the better of the two source band images. Figure 3-15 shows a very noisy I2
image  along  with  a  relatively  clean  LWIR  image.  The  resulting  Laplacian  pyramid  fused 
image  is  also  very  noisy.  The  multi-scale  fused  image  at  the  bottom  right  combines 
information  from  both  of  the  source  images  while  retaining  less  of  the  noise  from  the  I2 
band. 


Figure  3-15,  clockwise  from  top  left:  I2  image,  LWIR  image,  multi-scale  fused  image, 
Laplacian  pyramid  fused  image 

3.1.4.1.5 Mis-registration handling

Image  registration  is  a  significant  issue  in  image  fusion.  It  is  ideal  to  design  the 
system  so  that  both  sensors  share  an  optical  axis,  have  similar  optics,  similar  detector 
resolution,  and  collect  imagery  simultaneously.  This  is  often  infeasible,  and  imagers  are 
instead  placed  close  to  each  other  with  somewhat  similar  fields  of  view.  This  introduces 
registration  and  parallax  errors.  In  many  image  fusion  applications,  the  fusion  algorithm 
will regularly need to handle images that are somewhat mis-registered. In Figure 3-16, two
images of squares were mis-registered with each other by 2 columns and 2 rows. They
were then fused. The DWT fused image on the left of Figure 3-16 shows stronger
double-edging due to mis-registration than the multiscale fused image on the right.


Figure  3-16,  scale  2  DWT  fusion  on  the  left,  multiscale  fusion  on  the  right 

3.1.4.1.6 Differing spatial resolution

Images  to  be  combined  will  often  have  different  spatial  resolution.  Prior  to 
fusion,  the  smaller  of  the  images  must  be  up-sampled  and  low-pass  filtered.  The  result  is 
two  source  band  images  that  may  have  significantly  different  spatial  frequency  cutoffs. 
For  this  case,  successful  fusion  will  use  the  high  frequency  edges  from  the  higher 
resolution  band,  but  will  incorporate  information  from  both  bands  at  the  lower  spatial 
frequencies. The opponent processing fused image in Figure 3-17 has attenuated some of the
detail easily seen in the visible source image. Also, the strong signature from the hot tank
tracks seen in the MWIR image is not well represented. In the multiscale fused image,
both the hot tank tracks from the MWIR image and the fine details from the visible image
are seen.


Figure 3-17, clockwise from top left: visible image, MWIR image up-sampled 10x using
bicubic interpolation, multiscale fused image, opponent processing fused image

3.1.4.2 Thoughts on using a standard image fusion test set

The  fused  imagery  produced  using  this  source  set  can  then  be  evaluated  using  criteria 
appropriate  for  the  application.  For  some  applications  fusion  quality  or  image  quality 
metrics  may  yield  sufficient  information.  For  imagery  to  be  presented  to  human 
observers  it  may  be  necessary  to  perform  a  perception  test  using  the  standard  imagery.  It 
also  may  be  known  for  a  given  application  that  some  of  the  above  cases  will  not  be 
encountered.  As  new  fusion  algorithms  are  developed,  their  strengths  and  weaknesses 
should  be  found  by  quantifying  their  performance  on  an  image  set  containing  the  cases 
described  here.  It  may  be  deemed  appropriate  in  the  future  to  add  other  important  source 
band  cases  to  this  list  as  well. 

3.2  Milestone  Two  (2007):  Modify/adapt  existing  sensor 
models  to  account  for  image  processing  enhancements  in 
ISR sensors.

3.2.1  Classification  of  multispectral  images  using  filters  derived  from 
Independent  Component  Analysis 

In this research we implemented an Independent Component Analysis (ICA) based
data-dependent filter for multispectral texture feature extraction and applied a unique
method to fuse the extracted features. Combined space-spectral filters were derived from
ICA basis functions of multispectral MWIR images that were acquired in a field data
collection. These filters were used to extract features from multispectral MWIR images.
The space-spectral features were fused based on the phase relationship among basis
functions of the corresponding bands. The classification performance of these filters was
shown to be superior to that of filters derived from multispectral Principal Component
Analysis (PCA) basis functions and to opponent plus unichrome Gabor filtering. With no
added noise in the test images, the performances of the ICA-based filters, PCA-based
filters, and opponent plus unichrome Gabor techniques were similar. However, the
ICA-based filters reached their highest classification rate with fewer features than the
PCA and Gabor filters. With noisy test images, the ICA-based filters performed better
than the opponent plus unichrome Gabor and multispectral PCA-based filters. Therefore,
it was demonstrated that ICA-based filters perform better both in attaining higher
classification rates and in needing fewer features to attain them. For higher SNR values,
the performance of multispectral PCA was better than that of opponent plus unichrome
Gabor, but with reduced SNR opponent plus unichrome Gabor overtook multispectral
PCA in performance. The superior performance of our approach can be attributed to the
following two strategies: (1) combined space-spectral ICA-based filters for multispectral
feature extraction and (2) a feature fusion technique controlled by the phase relations
among ICA basis functions, which captures the statistical decorrelation process.
ICA-based filters provide a good balance between two popular approaches to filter design
for feature extraction: one that uses filters with good joint spatial and frequency
resolution and another that picks filter parameters based on the data. For our data set,
the ICA basis functions not only had joint spatial and frequency resolution properties
similar to (not the same as) Gabor filters, they also showed slightly more complex
(richer) frequency responses that were learned from the data itself. On the other hand,
the PCA basis functions were also data-dependent with rich frequency responses, but
they did not show good joint spatial and frequency resolution.

The research under this milestone has also been accepted for publication in the Optical
Engineering Journal.

3.2.2  Survey  of  denoising  techniques  for  image  enhancement 

This section discusses denoising techniques for image enhancement available in the open
literature. The algorithms described can be used to process images obtained from
Intelligence, Surveillance, and Reconnaissance (ISR) imaging systems. The report
provides a general overview without attempting to address problems related specifically
to ISR imagery. Interfering natural phenomena and imaging and data acquisition
instruments can contribute to noise in the image displayed to the end user. The presence
of noise in images can hinder the image understanding capabilities of the end user,
whether human or some form of artificial intelligence. Hence denoising is a very
important image enhancement tool. Denoising techniques can be broadly classified into
two main categories: spatial domain denoising and transform domain denoising.

Spatial filters act as smoothing operators on the image, so noise that shows rapid
spatial variations can be reduced. Spatial filters can be categorized as linear and
non-linear filters. Linear filters such as mean filters replace each pixel by the mean of
its neighborhood, while non-linear filters such as median filters replace each pixel by
the median of its neighborhood. Wiener filters are linear filters that can estimate the
original signal in the presence of additive noise. Wiener filters require knowledge of the
spectral characteristics of the original signal and the noise, or of their autocorrelation
and cross-correlation. The Wiener filter estimate of the original signal minimizes the
mean square error between the original signal and the estimate.
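
The difference between the mean and median filters can be seen on a single image row containing one impulse-noise pixel. The sketch below is a one-dimensional illustration with a hypothetical helper name and made-up pixel values; windows are simply truncated at the row ends.

```python
from statistics import mean, median

def sliding_filter(signal, reducer, radius=1):
    """Apply `reducer` over a sliding window, truncating the window at the ends."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(reducer(window))
    return out

# Hypothetical image row: uniform background with one impulse-noise pixel.
row = [10, 10, 10, 255, 10, 10, 10]
mean_out = sliding_filter(row, mean)      # linear: spreads the impulse over neighbors
median_out = sliding_filter(row, median)  # non-linear: rejects the impulse entirely

print([round(v, 1) for v in mean_out])
print(median_out)
```

The mean filter lowers the peak but smears it across three pixels, while the median filter removes it, which is why median filtering is often preferred for impulse noise.
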

Domain transformation involves representing an image (or any signal) as a linear or
non-linear combination of basis functions. The new representation can provide insights
into the characteristics of both the actual signal and the noise, depending on the
properties of the basis functions used. Basis functions can either be data dependent
(learned from the data) or fixed (predetermined or data independent). Data dependent
transformations involve identifying both the coefficients and the basis functions. Data
dependent transformations such as Independent Component Analysis (ICA) and Principal
Component Analysis (PCA) focus on finding basis functions and coefficients so as to
maximize the sparseness of the representation. The sparseness is achieved by applying
statistical constraints in the transform domain (independence of the transformed
variables in ICA and decorrelation of the transformed variables in PCA). Sparse
representations suggest that the actual (noise free) signal can be represented by a
smaller number of basis functions than the measured signal, and the bases that can be
ignored are very likely the noise components. For example, in PCA the eigenvectors of the
data covariance matrix are the basis functions. The data is first represented in the PCA
domain. During the inverse transformation process (from the PCA domain back to the
spatial domain), eigenvectors (basis functions) corresponding to small eigenvalues are
not used. Thus the spatial image is reconstructed without the noise components. The one
drawback of such data dependent denoising techniques is that they are computationally
expensive.

When a predetermined set of basis functions is used, the transform calculates the
coefficients needed to scale those basis functions. The denoising capability of these
transforms is strongly dependent upon the sparseness of their representation of a signal.
When it is known that a transform provides a sparse representation, it can be inferred
that only a few of the basis functions represent the actual signal. This means that for the
actual signal representation only a small number of transform coefficients are non-zero.
This sparseness of the wavelet bases is what makes the wavelet transform so effective in
denoising. Wavelet decomposition provides a joint spatial and spatial-frequency
representation. The spatial frequency part of the representation is made in terms of a
variable referred to as scale, which is the inverse of the spatial frequency. Thresholding
of wavelet coefficients to suppress certain basis functions is a common technique in
wavelet denoising and is referred to as wavelet shrinkage. Hard thresholding involves
setting coefficients with values less than the selected threshold to zero and leaving the
large coefficients intact. In soft thresholding, the coefficients less than the threshold
are also set to zero, but the coefficients higher than the threshold are forced towards
zero with values determined by x f(x), where x = (coefficient - threshold) and f(.) is a
function for creating the required mapping. The choice of threshold value can affect the
performance of the denoising process. Stein’s unbiased risk estimator (SUREshrink) and
VISUshrink (which provides the best performance in terms of mean square error as the
number of data points approaches infinity) are two popular threshold selection
techniques. Denoising of images can also be accomplished by using an optimal wavelet
tree representation, which relates to the ordering of terms in the wavelet expansion.

In  this  technique  the  wavelet  decomposition  is  represented  as  a  tree  with  each  level 
representing  the  scale  and  each  node  representing  a  coefficient.  A  large  value  child 
coefficient (node) can be assumed to be noise if its parent nodes are not well
pronounced.  Some  techniques  also  use  statistical  properties  of  coefficients  such  as  their 
probability  distributions  and  their  inter-scale  and  intra-scale  correlations  to  separate 
signal  coefficients  from  noise  coefficients. 
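
The hard and soft thresholding rules discussed in this section can be sketched as follows. The soft rule shown is the common choice in which surviving coefficients are shrunk toward zero by exactly the threshold; this is one possible mapping and is assumed here for illustration. The universal threshold sigma*sqrt(2 ln n) is the form usually associated with VISUshrink. The function names and the coefficient values are hypothetical.

```python
import math

def hard_threshold(coeffs, t):
    """Zero coefficients with magnitude at or below t; keep the rest unchanged."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the survivors toward zero by t."""
    return [math.copysign(abs(c) - t, c) if abs(c) > t else 0.0 for c in coeffs]

def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 ln n) for n coefficients, noise level sigma."""
    return sigma * math.sqrt(2.0 * math.log(n))

# Hypothetical wavelet detail coefficients: two signal terms among small noise terms.
coeffs = [4.0, -0.3, 0.2, -3.0, 0.1]
hard = hard_threshold(coeffs, 1.0)
soft = soft_threshold(coeffs, 1.0)
print(hard)
print(soft)
print(round(universal_threshold(0.25, len(coeffs)), 3))
```

Hard thresholding preserves the amplitude of the retained coefficients but introduces a discontinuity at the threshold, whereas soft thresholding is continuous at the cost of biasing large coefficients toward zero.
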

3.2.3  Conclusions: 

This report is a survey of denoising techniques for image enhancement. In most cases,
the end users of ISR images are humans involved in tactical discrimination tasks. The
problem with identifying a suitable denoising technique for ISR sensors is the
unavailability of human image perception evaluations of the denoising techniques
(specifically for ISR imagery). Hence future research should focus on human
psychovisual studies to compare various denoising algorithms in terms of detection,
recognition, and identification performance for ISR imagery.

3.2.4  References: 

M.  Motwani,  M.  Gadiya,  R.  Motwani,  and  F.  Harris,  “A  Survey  of  Image  Denoising 
Techniques,”  in  Proceedings  of  GSPx  2004,  September  27-30,  2004,  Santa  Clara 
Convention  Center,  Santa  Clara,  CA. 

4.0 Milestone - Performance Modeling of Advanced
Architecture Systems (Dr. Steven Griffin)

Prior  performance  models  are  physics  based  descriptions  of  the  components  and 
subsystems  that  form  images  in  the  non-terahertz  regime.  These  are  generally  either 
infrared  (IR)  or  visual  optical  modeling  systems  or  electromagnetic  field  (EMF)  based 
modeling  systems  in  the  millimeter  wave  (mmwave)  or  at  radio  frequencies  (RF). 
Though  the  terahertz  (THz)  regime  has  many  characteristics  in  common  with  these 
neighboring  regions,  some  aspects  are  significantly  different.  After  extensive  research 
and  a  number  of  meetings,  a  general  approach  to  THz  modeling  was  established.  Prior 
details were summarized in the three, six, nine, twelve, and fifteen month reports. Details
of the completion of milestones one, two, and three were reported in the progress reports.

This work complements and enhances existing Army Research Laboratory (ARL)
programs by concentrating on selected issues of mutual interest. This includes
corrections for deviations from the classical optical performance model descriptions. In
the first-year proposal cycle, three milestones received particular emphasis.

4.1 Support performance modeling of metal and plastic target
reflectivity and emissivity.

Direct interaction with ARL Maryland and NVESD Virginia was used to establish the
current state of the art with regard to target reflectivity and emissivity, with a view
towards advancing performance modeling. Existing performance models were modified
by reviewing existing ARL / NVESD based models against classic descriptions such as
Donovan’s and by incorporating additional empirical data. Significant differences and
causative factors were used to determine performance modeling recommendations. This
work supports the Army’s performance modeling efforts and improves performance
predictions or decisions resulting from those simulations.

To  accomplish  this,  four  actions  were  proposed.  The  third  action  was  modified  with 
ARL  approval  and  coordinated  with  recent  equipment  and  personnel  changes.  All  have 
been  completed. 

Enhancement of the model for reflectivity and emissivity of metals from prior ARL /
NVESD modeling and classic systems is complete. Donovan’s theory of metals
[Donovan, 1967] has proven to be very durable for the modeling of metals at high
frequency and is recommended for incorporation into the radiometric description. This
required the generation of a MatLab implementation of the model. Results of inquiries at
various libraries and universities in the Washington, D.C., Knoxville, TN, and Memphis, TN
areas, plus assorted research facilities of the Army (ARL, NVESD, etc.), were incorporated.
Responses from the University of Illinois and the University of Massachusetts at Lowell
were also utilized.
The University of Memphis is obtaining additional THz equipment from the Navy (ONR
- DURIP) to support data collection for validation purposes, but delivery has been
delayed. After consultation with the ARL contract monitor, the investigation was
modified to reduce travel expenditures for Jefferson Lab. This revised plan was
implemented to accommodate post-proposal reductions in available funding. To
accommodate the net effect of these changes, validation data was obtained from third
parties.

In addition, information regarding plastics modeling was provided by the Redstone
Scientific Information Center (RSIC). The principal development for plastics was the
inclusion of the water absorption common to most plastics. Because water has such a
strong interaction at THz frequencies, even small amounts included in the body of a plastic
can overwhelm other factors, not only changing the loss tangent but also possibly
producing and / or dominating frequency dependences. This occurs because the nature of
most polymers creates molecular lumps of long molecular chains. Water, driven by time
and temperature, can migrate along the boundaries and into these lumps, producing
plastics that not only contain significant water percentages by weight but whose
percentage is a strong function of hydrometric history. Some plastics have actually been
designed to eliminate this effect. An example is cross-linked, fused polystyrene,
which is used by the electronics industry at high frequencies specifically because it
excludes water.

This effect should not be underestimated. Some nylons contain 30% or more water by
weight, depending on their history. The rate of water uptake and release is a strong
function of surface-to-volume ratio and, as a result, is particularly pronounced for fabric
fibers. Denier, stress, and hydrometric history impact the water weight percentage in
cloth. This created an overlap with allied THz efforts under independent sponsorship.
Efforts have been made to coordinate with and leverage those programs.

Complete  documentation  of  technical  results  is  contained  under  separate  cover.  Below  is 
a  summary  of  some  pertinent  details. 

4.1.1 Reflectivity and emissivity of metals

Reflectivity,  emissivity,  and  absorption  are  intimately  related  quantities.  Radiometric 
system  modeling  provides  some  standard  representations  which  are  utilized  here. 

Starting  with  a  given  complex  index  of  refraction  yields: 
MatLab based simulations can be made of the complex index of refraction, reflectivity,
and emissivity for metals and plastics versus frequency across a broad band. As an
example, consider the broad-band description of the copper complex index in Figure 4-1A.
In this figure, common regions are labeled: radio frequency and microwave (RF),
millimeter and sub-millimeter wave (MM-wave), terahertz (THz), long wave infrared
(LWIR), medium wave infrared (MWIR), visible wavelengths (Vis), and X-ray (X-ray).
In the THz region of the figure, both the real and imaginary components of the complex
index of refraction begin to deviate from a polynomial description and separate from each
other. This is due to effects associated with the plasma frequency and the departure from
the Boltzmann non-relaxation assumption. Neither effect is normally included in
modeling metals in the millimeter and sub-millimeter region. Both must, however, be
included in the THz description.
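
The separation of n and k in the THz region can be illustrated with a minimal
free-electron (Drude) sketch that includes a plasma frequency and a finite relaxation
rate. This is not the report's MatLab implementation of Donovan's theory; the function
name `drude_index` and the copper-like parameter values are assumptions chosen only for
illustration.

```python
import cmath
import math

def drude_index(freq_hz, omega_p=1.12e16, gamma=1.38e13):
    """Return (n, k) of a free-electron metal at the given frequency.

    omega_p (plasma frequency, rad/s) and gamma (relaxation rate, 1/s) are
    illustrative copper-like values, not parameters taken from the report.
    Uses the exp(-i*omega*t) convention so that k is positive for loss.
    """
    omega = 2.0 * math.pi * freq_hz
    eps = 1.0 - omega_p ** 2 / (omega ** 2 + 1j * gamma * omega)
    index = cmath.sqrt(eps)  # principal root gives n >= 0
    return index.real, index.imag

# Compare a microwave frequency with two THz-band frequencies.
for f in (1e9, 1e12, 1e13):
    n, k = drude_index(f)
    print(f"{f:.0e} Hz: n = {n:.1f}, k = {k:.1f}, k/n = {k / n:.2f}")
```

Well below the relaxation rate the two components track each other closely; as the
frequency approaches the relaxation rate they separate, which is the qualitative
behavior described for the THz region.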


A similar description can be constructed for copper reflectivity and is presented in Figure
4-1B. Validation of the results was accomplished by comparison of modeling predictions
with actual data from a variety of sources. Such phenomenological data is scarce for the
THz region and tends to be concentrated at the lower frequencies [Bolivar, 2003].
Fortunately, due to atmospheric window issues, this is the region of most interest for
current applications. In particular, available data is at or near 650 GHz and is plotted in
Figure 4-2A. Multiple sources yield the data represented by the “X”. Notice the
extremely fine scale of the vertical axis. Values have been determined to three
significant places where subtle theoretical corrections have been applied to actual
readings. The error bar associated with the “X” is approximately three times its size on
this figure’s scale. As can be seen, the model’s prediction agrees, within the error bar,
with the mean (X) of the empirical data available to date. Similar curves can be generated
for other metals.


A similar comparison between modeling predictions of copper emissivity and the
empirical data available for the THz region produces Figure 4-2B. The same comments
made previously still apply to this case. Additional metals and plastics may also be modeled.

Figure 4-1 Intermediate Computation of Metal Behavior: (A) Real (n) and imaginary (k)
components of copper complex index of refraction; (B) Copper power reflectivity across a
broad band

Figure 4-2 Reflectivity and Emissivity of Copper: (A) Power reflectivity for copper with
available data over-plotted (vertical axis: power reflectivity as fraction); (B) Modeling
prediction of copper emissivity versus available data

4.1.2 Reflectivity and emissivity of plastics

At lower frequencies, plastics are generally effective dielectrics with relatively low
losses. As the THz regime is approached, a number of additional factors become
significant. Because plastics are complex organic compounds, various carbon chain
resonances and side chain molecular absorption modes can become factors. However, to
first order these interactions tend to be overwhelmed by a practical consideration. Plastics
typically incorporate a significant mass of water in their internal structure [Simonds, 1949].
Chemical molar analysis indicates the water density is consistent with a liquid state. This
water is in equilibrium with environmental sources. The water migrates along
macromolecular boundaries into the body of the plastic. The macromolecules of most
plastics are not linear but rather resemble a ball of string. Environmental water may
penetrate not only the body of the plastic material but also the folds and creases of the
macromolecule. Normally, inclusion of water into a solid matrix would dramatically
alter the water’s electrical properties. For plastics, however, the covalent nature of most
plastics and the extremely polar nature of the water molecule limit that effect. The nature
of the water inclusion is surface absorption along the intra- and intermolecular interfaces,
a process so pronounced that some common plastics under some circumstances can be
tens of percent water by mass. To control this variable, standard electrical testing
procedures prescribe a standard humidity / temperature treatment prior to the
determination of dielectric constant, etc. [Brown, 2000]. The time constant for water
uptake and release varies from minutes to days depending on the particulars of
material and environment. The standard electrical test procedure is intended to produce
equilibrium prior to the determination of electrical properties. Because of the long time
constants involved in the water absorption process, practical plastics exhibit hysteresis,
with their current state determined by environmental history as well as material
properties.

At THz frequencies, water tends to dominate the electrical properties intrinsic to the
plastic. Thus the actual field behavior of plastics under THz illumination is a function of
all the previously named parameters. A few plastics avoid this water-dominated behavior
[Vasile, 2000]. These tend to be cross-linked plastics, where side chain linkages between
adjacent macromolecules impede water absorption. A classic example is cross-linked
polystyrene, which is a low loss, transparent material from DC through THz up to
some near IR hydrocarbon absorption lines. Past these lines, it regains its optical clarity
until electronic transition lines are encountered in the near UV. As such, it makes an
excellent material for THz optics. It is, however, a rare exception. Most plastics contain
significant quantities of water, which dominate the absorption characteristics and affect
the total index of refraction. Water tends to be distributed at a scale which is small
compared to the wavelengths associated with the THz region. The electromagnetic field
cannot resolve the fine details but rather sees the water’s loss at the appropriate mole
fraction and an index intermediate between water and plastic. The index is not greatly
modified because the typical plastic and water indices are not very different to begin with
and the mole fraction of water is typically a few percent.

Taking this into account, a simple complex-index-based model treats the plastic’s own
losses (imaginary component) as trivial, and the index (real component) is prorated
between the plastic and water values. The plastic’s effective loss is simply a molar
prorated version of the water loss. Details of the incorporation and structure of the model
parallel those for metals above. Crucial to this is the assumed molar water content of the
plastic. Since this parameter cannot be determined in an application-meaningful way, it is
left as an input and will probably be set to a typical value. Because THz imaging depends
on relative (not absolute) contrast, this is not a major modeling limitation. The results are
simple equations using nr = 1.6 and ni = 0.6 as typical values.

Where N is the complex index,
FW is the molar fraction of water,
FWS is the molar fraction of water at standard conditions, and
Np is the complex index of refraction for the plastic under standard conditions.
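
The prorating described in this section can be illustrated with a minimal sketch. It does
not reproduce the report's own equations; it assumes simple linear mole-fraction mixing
with a lossless dry plastic. The function name `water_loaded_index` and the default index
values are hypothetical placeholders chosen only for illustration.

```python
def water_loaded_index(fw, n_plastic=1.6, n_water=2.0, k_water=0.6):
    """Return (n, k) for a plastic holding mole fraction fw of water.

    Assumes linear mole-fraction mixing (an illustrative assumption):
    the real index is prorated between plastic and water, and the loss
    is the water loss scaled by the water fraction.
    """
    if not 0.0 <= fw <= 1.0:
        raise ValueError("fw must be a mole fraction between 0 and 1")
    n = (1.0 - fw) * n_plastic + fw * n_water  # prorated real index
    k = fw * k_water                           # loss comes only from water
    return n, k

n, k = water_loaded_index(0.05)
print(f"n = {n:.3f}, k = {k:.3f}")
```

Leaving the water fraction as an input mirrors the approach described in the text, where
the molar water content is set by the user to a typical value.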

4.1.3  Conclusions  regarding  metals  and  plastics 

Metals and plastics can be modeled effectively in the THz frequency regime. Metal
modeling requires release from the Boltzmann non-relaxation assumption and inclusion
of plasma frequency effects, differentiating it from the assumptions typically made at
sub-millimeter frequencies and below. Plastics are dominated by water behavior to the
point that hydrological history and water material properties overwhelm the intrinsic
behavior of the plastic within the THz band. The models compare favorably with the
available empirical data.

Figure 4-3 Complex Index for water dominated plastic

4.2 Support performance modeling of water and human target
reflectivity and emissivity.

Modification of existing performance models was accomplished by reviewing existing
ARL / NVESD based models against previously assembled classic descriptions.
Limited empirical validation was developed at the University of Memphis, to the extent
compatible with the delayed delivery of the ONR DURIP equipment. Significant
differences and causative factors were used to determine performance modeling
recommendations. This work should enhance the Army’s performance modeling efforts
and improve any performance predictions or decisions resulting from those simulations.

To  accomplish  this,  four  actions  were  proposed.  The  third  action  was  modified  with 
ARL  approval  and  coordinated  with  recent  equipment  and  personnel  changes. 

Enhancement of the model for reflectivity and emissivity of water, ice, and snow from prior
ARL / NVESD modeling and classic systems such as Ray is complete. Porting of
MatLab code to the new University of Memphis High Performance Cluster (HPC) has
been delayed, as the software is just now becoming available. Inquiries at the University
of Memphis, Vanderbilt, and University of Tennessee (Knoxville) libraries have produced
considerable phenomenological detail for incorporation into the model for water.
The reduction in travel funds has eliminated Jefferson Laboratory involvement, as
previously cited.

The incorporation of human tissue into the reflectivity and emissivity models based on
prior biologic measurements is complete. The University of Tennessee Medical Library
(Memphis) has been consulted. Some additional information was obtained from
Vanderbilt. Incorporation of this into MatLab emphasized the dominant role of water.
The outer skin layers are relatively transparent to THz energy, a fact utilized to develop
a number of biological sensors for the upper and lower skin layers. Since the interior
body layers are essentially a matrix containing predominantly water and the water
interactions are so strong, the best first order model is a body of water situated a few
millimeters below the skin surface. Originally, human skin validation was to be
undertaken using an approximately one square inch sample of living tissue as a THz
target. Given the delays in the receipt of the ONR DURIP equipment, the dominant nature
of water in the tissue, and the minor impact of the body matrix on the first order results, it
was decided to forgo the live tissue sample.

4.2.1  Reflectivity  and  emissivity  of  water,  ice  and  snow 

As the second year unfolded, it became increasingly obvious that the domination of THz
spectra by the various states of water rendered more detailed analysis moot. Specifically,
anything that could contain water, particularly if the water percentage varied depending
on history, would have its characteristics overwhelmed by the included water. Water
is such a strong absorber in the THz regime that the contribution due to the matrix that
holds the water is trivial. As such, thorough characterization of water was essential, not
only in its own right but because it tends to dominate the behavior of many materials
such as plastics and biological tissues, including human tissue. Metal stands out in THz
images partly because it naturally excludes water. Thus, a relative contrast is generated
against a soggy background.

In the THz regime, the assumptions that work for the far infrared and the microwave are in
transition. This occurs for water, ice, and snow in a manner analogous to that cited in
Section 4.1 for metals. Thus the limiting case normally used to describe water in the
microwave and far infrared regions is not adequate for the THz. The fundamental
physical mechanisms are in transition from one limit to another and all have to be
included for an accurate description. The electrical conductivity dominated behavior,
small dielectric losses, and impurity issues that dominate the low frequencies give way to
the classic Debye relaxation description in the microwave band. In the far infrared,
molecular absorption lines dominate the Debye description. In the THz, the Debye
description is significantly affected by the tails of the higher frequency absorption lines
and neither can be ignored. Figure 4-4 is a summary plot of a computer model for the
real (n) and imaginary (k) components of the complex index of refraction for water at
25°C.
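
To illustrate the microwave-limit description mentioned above, the following is a
minimal sketch of a single-term Debye relaxation model converted to n and k. It is not
the report's model, which also accounts for the tails of higher-frequency absorption
lines. The function name `debye_index` is hypothetical, and the parameter values (static
permittivity, high-frequency permittivity, relaxation time) are approximate,
textbook-order figures for liquid water near room temperature, assumed here for
illustration only.

```python
import cmath
import math

def debye_index(freq_hz, eps_static=78.4, eps_inf=5.0, tau_s=8.3e-12):
    """Return (n, k) from a single Debye relaxation term.

    Parameter defaults are illustrative assumptions, not report values.
    Uses the exp(-i*omega*t) convention so that k is positive for loss.
    """
    omega = 2.0 * math.pi * freq_hz
    eps = eps_inf + (eps_static - eps_inf) / (1.0 - 1j * omega * tau_s)
    index = cmath.sqrt(eps)  # principal root gives n >= 0
    return index.real, index.imag

# Microwave through THz: the Debye term alone flattens out at high frequency.
for f in (1e9, 1e11, 1e12):
    n, k = debye_index(f)
    print(f"{f:.0e} Hz: n = {n:.2f}, k = {k:.2f}")
```

Comparing such a Debye-only curve against measured THz data is one way to see the
additional contribution that the text attributes to the absorption-line tails.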

Figure 4-4 Complex index of refraction of water at 25°C


The figure is highly compressed; however, the model is capable of expanding any chosen
range to a fine scale. THz frequencies correspond to approximately 0.1 cm wavelengths.
By referring to Figure 4-4, it can be seen that the total index components, both real and
imaginary, begin to deviate from the classic Debye description in the THz band. This is
the result of the tails of the resonant lines at higher frequencies and should not be
neglected. The complex index was chosen to represent the emissivity information
because it lends itself to application across a broad range of frequencies. The complex
index of refraction can be converted to reflectivity and emissivity by standard techniques.

\[
R = \frac{(1 - n)^2 + k^2}{(1 + n)^2 + k^2}, \qquad E = 1 - R \quad (\text{assumes } T = 0)
\]

where R is reflectivity, E is emissivity, and T is transmission.
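
The conversion from complex index to reflectivity and emissivity can be sketched in a few
lines. This is an illustrative sketch rather than the report's MatLab code; the function
name `reflectivity_emissivity` and the sample index values are hypothetical, and the
relation is the standard one for normal incidence from air onto an opaque (T = 0) surface.

```python
def reflectivity_emissivity(n, k):
    """Return (R, E) for normal incidence from air onto an opaque medium."""
    r = ((1.0 - n) ** 2 + k ** 2) / ((1.0 + n) ** 2 + k ** 2)
    return r, 1.0 - r  # E = 1 - R when transmission is zero

# Hypothetical index values of the order expected for a lossy dielectric.
r, e = reflectivity_emissivity(2.0, 0.5)
print(f"R = {r:.3f}, E = {e:.3f}")
```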

Applying this gives Figure 4-5, again for liquid water at 25°C. It should be noted that
the characteristics of all the states of water are functions of temperature. Different
temperatures of ice, for instance, demonstrate different emissivities. This is a
characteristic sometimes utilized for remote sensing. Such behavior may result in
differential contrast between water at different temperatures in THz images due to
variations in power reflectivity, as seen in Figure 4-5A.

THz again falls at the transition point where the emissivity and reflectivity are changing
rapidly with respect to frequency and temperature. For space limitation reasons, the
temperature figures are not included but are published under separate cover. Similar
behavior can be noted with respect to ice. Once again, the temperature of the ice has a
significant impact on the complex index of refraction and, as a consequence, on the
reflectivity and emissivity.

One of the impacts of the strong dependence of emissivity on frequency pertains to
swept frequency systems. A common motivation for sweeping frequency is to modify
artifacts caused by phase interference associated with imaging using narrow frequency
bandwidths. Most current THz sources for active imaging are naturally narrow in
bandwidth, sometimes made more so by atmospheric transmission characteristics. Because
the wavelength at THz frequencies is on the order of 0.1 millimeters, details such as beard
stubble can produce dramatic image effects. These narrow bandwidth (~monochromatic),
temporally coherent effects can be reduced by sweeping the frequency at a rate that is fast
compared to the system integration time. Unfortunately, this will make the
reflectivity and emissivity also sweep through values. The impact of this for water
dominated materials such as beard stubble is not known to the author and is a topic of
potential future study.

Figure 4-5 Reflectivity and Emissivity of Water at 25°C: (A) Reflectivity of water at 25°C
(vertical axis: power reflectivity as fraction); (B) Emissivity of water at 25°C (vertical
axis: emissivity as fraction)

Similar behavior can be noted with respect to ice. Once again, the temperature of the ice
has a significant impact on the complex index of refraction. In Figure 4-6, the complex
index of refraction for ice at 0°C is plotted versus wavelength. In this case the real
portion of the index is relatively stable because of the extremely weak effect of Debye
relaxation on this solid. Only a few, very slight, resonance effects are seen in the
infrared. Similarly, temperature has little effect except in the neighborhood of the ‘n(A)’
label. Variations in temperature cause the shoulder in “n” pointed to by the arrow to
translate in frequency.

In the case of the imaginary component ‘k’ (losses), the opposite extreme holds. The
value changes rapidly with both frequency and temperature variations. The THz band is
near the minimum of the curve and the corresponding values for the complex index can
easily vary by an order of magnitude or two.


Figure  4-6  Complex  index  of  refraction  of  ice  at  0°C 

The corresponding reflectivity and emissivity curves are shown in Figure 4-7. Once
again, the reflectivity and emissivity curves are just beginning to leave a flat region as the
THz band is entered from lower frequencies. This is due to the impact of the tails of
infrared resonance lines on the lower frequency behavior, with the same implications
cited before.

Details  of  snow  are  similar  and  omitted  for  brevity. 

Figure 4-7 Reflectivity and Emissivity of ice at 0°C: (A) Reflectivity of ice at 0°C
(vertical axis: power reflectivity as fraction); (B) Emissivity of ice at 0°C (vertical
axis: emissivity as fraction)

4.2.2 Incorporation of human tissues

Water behavior dominates the behavior of human tissue. The matrix effect of the tissue
structure is second order in impact. Biological studies utilize these second order effects
for diagnostic purposes. These techniques work because the upper layers of body tissue
are fairly transparent to THz radiation and the low photon energy at THz frequencies
presents little ionizing radiation hazard. Because the upper tissue layers are fairly dry,
the THz radiation penetrates through them. Once the THz radiation reaches
the lower wet region, it is rapidly attenuated. The net result is a weak, second order
signal from the dry upper tissue layers superimposed on a moderate “first surface”
reflection from the deeper water layer. The transmitted signal is rapidly attenuated in the
wet region. As is typical for the THz regime, emission is nil. The net result from a
diagnostic standpoint is a weak signal correlated with the upper layer tissue structure
against a background of THz reflection noise. The resulting SNR is low but yields
helpful information regarding the upper tissues after enough image processing.
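
The rapid attenuation in the wet region can be made concrete with a short sketch based on
the standard relation between the extinction coefficient k and the power absorption
coefficient, alpha = 4*pi*k / lambda. This is not taken from the report; the function
name `penetration_depth_m` and the sample value k = 0.5 at 1 THz are hypothetical, chosen
only to be of the order expected for liquid water.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def penetration_depth_m(freq_hz, k):
    """Return the 1/e power penetration depth in meters."""
    if k <= 0:
        raise ValueError("k must be positive for an absorbing medium")
    wavelength = C_M_PER_S / freq_hz          # free-space wavelength
    alpha = 4.0 * math.pi * k / wavelength    # power absorption coefficient
    return 1.0 / alpha

depth = penetration_depth_m(1e12, 0.5)
print(f"1/e power depth: {depth * 1e6:.1f} micrometers")
```

With values of this order the depth comes out at tens of micrometers, which is consistent
with treating the wet tissue as an opaque water surface in a first order model.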

From the perspective of THz imaging, the second order effects of the upper tissue layers
are not significant. To first order the system may be modeled as a water barrier
supported by a tissue matrix. The matrix effects of the tissue are trivial. This leaves just
the water molded into compound 3-D curves. Generally the curvature occurs over many
wavelengths (the exception being hair) so that classic models regarding impact of a plane
wave on an air to water interface may be used. The details of the water behavior are
incorporated into the previous section.

Given the delays in the receipt of the ONR DURIP equipment, the dominant nature of
water in the tissue, and the minor impact of the body matrix on the first order results, it
was decided to forgo the live tissue sample at this time.

4.2.3  Jefferson  Laboratory  validation 

Subsequent to the original proposal for the Center for Advanced Sensors - Year 2, the
funding available was cut by a significant percentage. This was undertaken at
congressional request. This in turn necessitated the cutting of some expenses associated
with this project. After this situation became known, the University of Memphis was
notified of the receipt of an ONR based DURIP for $188,602.00 in THz related
equipment. As a result it was decided, in consultation with the ARL contract monitor,
to terminate this activity to rebalance the budget. Unfortunately the ONR equipment was
greatly delayed. Independent validation was obtained from other sources including
collaborators and through the use of borrowed time on some loaned Army THz
equipment. Though the additional data generated was limited, it, combined with historical
data, allowed the completion of this portion of the contract. It is anticipated that the ONR
equipment will be received soon and additional new data generated at that time.

4.2.4  Conclusions  regarding  water  and  tissue 

It was determined that water dominates the behavior of human tissue. Utilizing
classic plane wave based reflection, absorption, and transmission coefficients was
recommended for the modeling of human tissue. The upper layers contribute too little to
the description to justify their incorporation. They should be treated as transparent,
followed by a water surface a few millimeters below the skin. The modeling of water
depends on its state: liquid, ice, or snow. Models describing each were implemented. Ice
and snow are of primary interest for scene backgrounds. The water model is important
for a variety of applications where the water characteristics dominate those of its
supporting matrix. In the interest of trading brevity against completeness, it was decided
that human tissue was best modeled as a matrix containing water. The water starts a few
millimeters below the skin surface and exhibits standard behavior at the ambient
temperature. The second order matrix effects of the tissue are slight but measurable and
are best left to biological applications.

4.3 Develop Techniques to Enhance THz Target Performance
in Imaging Applications through unique procedures for
improved contrast, noise reduction, chemical signatures,
etc.

This research advances the development of processing techniques that take advantage of
unique target properties to improve THz imaging system performance. As such, this
portion of the project was implemented after the completion of milestones one and two.
This allowed the incorporation of target results into recommendations regarding THz
image processing. Of principal consideration here is the interaction of the eye CTF
response with target characteristics related to noise and resolution. Emphasis was placed
on techniques designated as most promising by ARL and NVESD.

To  accomplish  this,  three  specific  actions  were  developed: 

4.3.1  The  processing  techniques  that  leverage  eye  CTF  response  to 
improve  THz  imaging  system  contrast  were  investigated. 

A number of processing techniques that leverage the CTF of the eye to improve THz
imaging response were considered. Of these, the behavior of clutter due to water
distributions in objects such as fabrics was emphasized. In the process, a new
interpretation of clutter in spatial systems was developed. It was applicable to TIFT IB
modeling and has been incorporated into proposal work towards TIFT2. To understand
this result, consider first Figure 4-8, a plot of the Contrast Transfer Function
(CTF) of the human eye. The CTF establishes the minimum contrast the eye can detect
versus spatial frequency. The vertical axis is scaled in contrast values, where a value of
one indicates that twice the background average luminance (signal average plus noise)
equals the peak-to-peak value of the signal. In other words, at a CTF value of one the
entire average value is attributable to the signal. As will be seen in a moment, this is
important in the description of THz clutter by 1/f noise.
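
The contrast scaling described above can be illustrated with a short calculation. The following is a minimal sketch, assuming the contrast is the peak-to-peak signal divided by twice the average luminance, as stated for the vertical axis of Figure 4-8. The function names and the luminance values are hypothetical and are used only for illustration.

```python
def modulation_contrast(l_peak, l_trough, l_average=None):
    """Return the peak-to-peak signal divided by twice the average luminance."""
    if l_average is None:
        # Assumption: the pattern is symmetric about its mean level.
        l_average = 0.5 * (l_peak + l_trough)
    if l_average <= 0.0:
        raise ValueError("average luminance must be positive")
    return (l_peak - l_trough) / (2.0 * l_average)


def exceeds_threshold(contrast, ctf_threshold):
    """True when the pattern contrast is at or above the eye CTF threshold."""
    return contrast >= ctf_threshold


if __name__ == "__main__":
    # Hypothetical luminance values in foot-Lamberts.
    print(modulation_contrast(2.0, 0.0))  # fully modulated pattern
    print(modulation_contrast(1.1, 0.9))  # weakly modulated pattern
```

A pattern at a given spatial frequency would be considered detectable in this sketch when its contrast is at or above the CTF value read from the curve at that frequency.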

In addition to the eye CTF there is also an eye MTF, which establishes the gain associated
with the eye. Eye MTF is a function of many parameters, such as average luminance.

An example response curve is shown in Figure 4-9 below.

In the previous section (4.2) the predominance of water in determining behaviors at THz
frequencies was established. In addition to having a very strong effect, water is also
ubiquitous, not only in the atmosphere but also in many “solid” materials such as
plastics and cloth. Thus the imaging behavior of these materials is dominated by the
presence of water and by the hydrological history of the material, which determines its
water content. One of the distinguishing characteristics of metals is that they effectively
exclude water. Thus THz images may be thought of as metal-on-water-background
imaging. This provides contrast not only for detecting objects of high metallic content
but also for relative contrasting of the water-containing materials. Because the
water content of a given object is a function of its history, absolute contrast is not an
effective marker. However, for imaging applications it is the relative contrast that
matters. In fact, the eye generally responds to relative values, not absolute ones. Thus even
similar materials that have undergone different hydrological histories may be imaged
against each other.

Within this context, the residual non-equilibria of water concentrations in materials
represent a potential spatial noise that may be interpreted as clutter. Typical material
time constants that describe the absorption and transport of water may be on the order of
hours or days. As such, many materials never really reach equilibrium with their
surroundings. This may be an advantage for relative contrasting in images, but it also
means that a single object rarely achieves water uniformity. The residual “clumping” of
water concentrations will produce similar clumps in THz reflectivity, absorption and
transmission that may be a significant source of low-frequency spatial structure.

Considering Figures 4-8 and 4-9, it can be seen how sensitive the eye is to low spatial
frequency noise (clumps). Though there is much discussion regarding the exact
definition of clutter, there is general agreement that it is a low spatial frequency effect. For
the purposes here, clutter can be divided into fixed pattern noise (irrespective of source),
1/f noise, the interaction of these within the non-linear characteristics of an imaging
system, and other unwanted signals. Again, the intention here is not so much to resolve the
debate as to narrow the discussion to terms that are relevant here. Fixed pattern noise
consists of spatially fixed noise pixels due to poorly performing elements in focal plane
arrays, specular glints in a fixed field of view, etc. Spatial flicker (1/f) noise is the
ubiquitous result of minor non-linearity present in all physical phenomena, and it has the
unfortunate property of being invariant under linear superposition. This means that
post-detection image processing, whether analog or digital, tends to reduce image noise but
has no effect on the 1/f noise, leaving it as a primary noise component. This is because most
post-detection processing is the repeated application of techniques based on linear
superposition, which does not affect 1/f noise. Classic non-linear techniques such as
lock-in-amplifier chopping help, but residual 1/f behavior in phase still produces annoying
phase-lock drift. The ultimate treatment of 1/f noise remains an area of significant research.

From Figures 4-8 and 4-9, the spatial frequency region from a few milli-cycles/milli-radian
to approximately one cycle/milli-radian is identified as the principal range of
interest for 1/f noise analysis. Within this range the eye has both significant gain and
sufficient insensitivity to other noises to interpret the 1/f spectrum as image clumps. Such
unwanted, nondescript objects are a major distracter that a viewer would typically call
clutter. Behavior consistent with this has been seen in the few available THz images and
needs to be characterized with regard to its impact on potential THz imaging. Before
proceeding too far in this direction, the cause of the 1/f noise for THz applications of
immediate interest needs to be considered. This is developed in section 4.3.2.

Figure 4-8 Eye CTF

Figure 4-9 Eye MTF

4.3.2  The  development  of  processing  techniques  to  improve  THz 
imaging  system  response  with  regards  to  temporal  and 
spatial  noise  performance. 

As an example of processing techniques to improve THz imaging response to temporal
and spatial noise, the development of clutter in moist fabrics was considered. Assuming
the use of heterodyne detection techniques, temporal noise was modeled by traditional
techniques such as system noise temperature and found to be trivial for this application.
Traditional spatial noise processing techniques reduce the spatial frequency base noise to
acceptable levels but leave a difficult spatial noise residue usually called clutter.
Modeling this residue as 1/f or flicker noise is demonstrated to be useful and provides
insight into potential new techniques for treatment. It is more than coincidence that both
clutter and spatial 1/f noise are ubiquitous. It has been shown that they stem from
fundamental physical requirements. In addition, neither can be removed by conventional
processing that is expressible as additions and multiplications. To make the discussion
less abstract, the special case of moist fabrics was considered. The results are more
general than this case, which was chosen for illustration purposes.

For fabric used in clothing there is significant transport of water along and through the
material. The fabric water content is the result of an equilibrium between various sources,
such as human perspiration, atmospheric relative humidity and spills, and sinks to the
atmosphere or other, drier objects. The fabric acts as a conveyor to transport water from
high concentration to low and ultimately dissipates it to the air. Fabrics which do not do
this are judged to be “uncomfortable” and are simply not worn. The processes involved
in this transport of water result in a second-order “clumping” of water concentrations.
After processing through the eye MTF and CTF, this results in a 1/f spatial noise behavior
that is interpreted as clutter in THz images. To get a feel for this, consider Figure 4-10.

Figure 4-10 Water Equilibrium in Fabric (diagram labels: Water Transport along Fabric; Net Evaporation)


In this figure a small swatch of clothing fabric is represented. Generally the fabric
thickness is small compared to the wavelengths of interest in the THz regime. As a result
the problem reduces to a two-dimensional description. In the figure the water transport is
described for a small differential surface element, dS. Into this surface element flow
the net external sources (perspiration, etc.), and out of it flow the losses to external sinks,
such as evaporation. In addition there is transport of water along the fabric both into and
out of the differential element. To understand this from a physics standpoint, the problem
can be reduced to a one-dimensional analogue.

Where the range of 2 dx is indicative of a Laplacian.

Or more formally it can be described in three dimensions as:

$$Q = \frac{\partial W(x,y,t)}{\partial t} = \nabla \cdot \left(-\nabla\left(K\,W(x,y,t)\right)\right) + C\,W(x,y,t)$$

Which becomes:

$$\nabla^{2} W(x,y) = \gamma\,W(x,y)$$

where $\nabla \cdot (\nabla W) = \nabla^{2} W$ and $\gamma = C/K$.

Now, the resulting equation, $\nabla^{2} W(x,y) = \gamma\,W(x,y)$, called the Helmholtz
equation, is a generic description of water transport in a fabric. The normal solution
technique for this equation is separation of variables in a convenient coordinate system.
Special cases of this equation include a homogeneous form ($\gamma = 0$) sometimes called
the spatial component of the diffusion, heat or telegrapher’s equation. Equations of this
form describe centro-symmetric physical phenomena, that is, phenomena where there is a
restorative “force” toward an equilibrium. This is a mathematical way of saying that the
water flows from the surrounding regions of high concentration to fill regions of low
concentration, albeit slowly.
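
To make the behavior of this equation concrete, a one-dimensional analogue can be solved numerically. The following is a minimal sketch, assuming a constant positive gamma, a strip of fabric of hypothetical length with fixed water concentrations at both ends, and a standard second-order finite-difference discretization. None of these values come from the measurements in this report. The closed-form hyperbolic-sine solution is included only as a check on the numerical result.

```python
import math


def solve_helmholtz_1d(gamma, length, w_left, w_right, n=200):
    """Solve W'' = gamma * W on [0, length] with fixed end concentrations.

    Uses central differences on n intervals and the Thomas algorithm for
    the resulting tridiagonal system (sub- and super-diagonal equal to 1).
    """
    h = length / n
    m = n - 1                          # number of interior unknowns
    diag = -(2.0 + gamma * h * h)      # main diagonal entry
    rhs = [0.0] * m
    rhs[0] -= w_left                   # boundary values move to the right side
    rhs[-1] -= w_right

    # Forward elimination.
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = 1.0 / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, m):
        denom = diag - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - d_prime[i - 1]) / denom

    # Back substitution.
    w = [0.0] * m
    w[-1] = d_prime[-1]
    for i in range(m - 2, -1, -1):
        w[i] = d_prime[i] - c_prime[i] * w[i + 1]
    return [w_left] + w + [w_right]


def exact_solution(gamma, length, w_left, w_right, x):
    """Closed-form solution of W'' = gamma * W for gamma > 0."""
    k = math.sqrt(gamma)
    return (w_left * math.sinh(k * (length - x))
            + w_right * math.sinh(k * x)) / math.sinh(k * length)


if __name__ == "__main__":
    # Hypothetical values: gamma = 4 per unit length squared, unit-length strip.
    profile = solve_helmholtz_1d(4.0, 1.0, 1.0, 0.5)
    print(min(profile), max(profile))
```

With gamma greater than zero the profile sags below the straight line joining the two end values, which is the one-dimensional picture of water draining from wet regions toward drier ones.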

There exist mathematical proofs that show that centro-symmetric phenomena are of
necessity energy non-linear and homogeneous. A physical phenomenon described by the
Helmholtz equation is therefore energy non-linear and homogeneous. It is known that a
chaotic system exhibiting nonlinearity plus homogeneity will generate 1/f (flicker) noise.
At low signal strength (noise levels) the system is dominated by chaotic behavior, giving
rise to noise in the spatial distribution of water exhibiting a 1/f power spectral
distribution. The spatial appearance of such a noise distribution is spatially distributed,
weak “lumps and clumps” superimposed over the principal water distribution.
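
The appearance of such a distribution can be explored by synthesizing a spatial profile whose power spectrum falls as 1/f. The following is a minimal one-dimensional sketch, assuming cosine components with amplitude proportional to one over the square root of the frequency index and uniformly random phases. It is a synthetic construction for illustration and does not model the chaotic water transport itself. The function names, sample count and seed are hypothetical.

```python
import cmath
import math
import random


def synthesize_one_over_f_profile(n_samples=128, seed=0):
    """Sum cosines with amplitude 1/sqrt(k) so that power falls as 1/k."""
    rng = random.Random(seed)
    half = n_samples // 2
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(half)]
    profile = []
    for n in range(n_samples):
        value = 0.0
        for k in range(1, half):       # k = 0 is skipped: no DC singularity
            amplitude = 1.0 / math.sqrt(k)
            value += amplitude * math.cos(
                2.0 * math.pi * k * n / n_samples + phases[k])
        profile.append(value)
    return profile


def power_spectrum(profile):
    """Direct DFT power for frequency indices 1 .. N/2 - 1."""
    n = len(profile)
    powers = []
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(profile))
        powers.append(abs(coeff) ** 2)
    return powers


def loglog_slope(powers):
    """Least-squares slope of log(power) versus log(frequency index)."""
    xs = [math.log(k) for k in range(1, len(powers) + 1)]
    ys = [math.log(p) for p in powers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    profile = synthesize_one_over_f_profile()
    print(loglog_slope(power_spectrum(profile)))
```

Because the low-frequency components carry the largest amplitudes, the synthesized profile is dominated by broad, slowly varying lumps rather than fine-grained speckle.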

Because of the nature of 1/f noise, subsequent processing of the image cannot remove it.
This noise is invariant under the operations of addition and multiplication. In fact,
subsequent processing to remove spatial noise tends to emphasize it as the general
background noise falls. This is particularly true for extremely weak signals. Since
almost all post-detection processing algorithms obey linear superposition, clutter
generated by 1/f noise is ubiquitous in images. Be aware that this does not apply to clutter
as unwanted signal information, only to the subclass generated by 1/f noise.

Figure 4-11 shows the magnitude of the spectrum of 1/f noise plotted versus spatial
frequency in cycles/milliradian. It should be noted that the vertical scale in this figure
is milli foot-Lamberts as a consequence of the spectral density nature of the function.

Milli  Foot-Lamberts  =  Foot-Lamberts  /  (cycles/milliradian) 

The actual luminance represented by this figure is 100 Foot-Lamberts, a high but not
unreasonable value. This noise is subsequently processed by the eye MTF and CTF,
giving a slightly different shape documented in Figure 4-12. The eye’s rising
insensitivity at very low spatial frequencies, documented in Figure 4-8, has the
mathematically convenient impact of concealing the singularity at zero. Otherwise it has
little impact at low frequencies. However, the shape of the curve at mid and high spatial
frequencies is changed by the cumulative impact of eye MTF and CTF. The resulting
curve is rich in low-frequency components and, with the inclusion of the proper phase
relationship, produces spatial clumps (clutter) in the spatial domain. Details of this are
too involved for this document and are reproduced under separate cover.

Figure 4-12 1/f Noise Trimmed and Attenuated by Eye Response (plot legend: Spectrum with Eye Response)

4.3.3  The  development  of  processing  techniques  to  improve  THz 
imaging  system  response  with  regards  to  unique  chemical 
signatures  of  preferred  target  material  was  modeled. 

Compressive sensing techniques were applied to chemical signatures. The use of high
energy materials at a university is complicated. Therefore alternative chemical signatures
were considered to demonstrate the potential of this technology.
Generic spectroscopy was also avoided because of the data overload associated with
combining detailed spectra at every point in a two-dimensional image. Such large
scale data fusion is the emphasis of other research activities and is not considered here.
The emphasis here was the collection of selective temporal frequency data on a
two-dimensional image. The intent was to produce chemical signature identification without
collecting a complete spectrum at each point and attempting to integrate it into the image.

The result of the consideration of numerous competing techniques [Wang, 2002] was the
selection of single-frequency detectors distributed at seven specific locations across the
visible band. The actual approach is a modified version of one developed by the
compressive sensing community [Baraniuk, 2007, unpublished] [Baron, unpublished]
[Brown, 2000] [Candes, 2006] [Donoho, 2006] [Gastpar, 2003] [Vetterli, 2002] to reduce the
number of detectors required to sense a given signal. In this case, it is envisioned that
only selected frequency information is required for the pixel locations contained in a
two-dimensional image. From this an effective chemical signature sort can be accomplished
without resorting to the collection of computer-clogging volumes of spectroscopic data.
As a demonstration, consider the seven line-filtered detectors (bandwidth ~1.5 Å)
represented in Figure 4-13. Though originally not random, the placement of these
frequency sample points can be described by a Poisson statistical density with a “p” of
0.023 and a k=7. The resulting placement is described by a random variable and is
sufficiently uncorrelated to meet the requirements for compressive sensing.

As a test of this sensing modality’s ability, the instrument was used to identify seven
elements from a complex organics mixture containing macromolecules of biological
interest. A double-blind series of experiments was undertaken to validate this sensor’s
detection results against standard techniques such as GFAA (Graphite Furnace Atomic
Absorption). The result was the determination of the seven elements of interest in the
presence of extensive spectral noise from the organic macromolecules. Details of this
work have been documented in other publications and will not be repeated here. Of
significance here is the demonstrated effectiveness of compressive sensing techniques in
determining specific materials in the presence of significant spectral noise without
resorting to heroic data fusion efforts.

Thus the application of compressive sensing techniques to the identification of the
chemical signatures of materials of interest designates only a few frequencies of interest
that must be collected in a two-dimensional image. This, along with the subsequent
post-processing [Tropp, 2003, 2005], should allow a more effective material ID.

Figure 4-13 Poisson Distribution of Compressive Frequency Sensors

4.3.4 Conclusion

Publications  and  proposals  in  addition  to  those  cited  for  year  one  are  in  process  as  a  result 
of  this  contract.  Follow-on  research  has  been  suggested  by  these  results.  Other  sponsors 
have  expressed  interest  in  funding  additional  work.  New  interactions  with  a  number  of 
military  sites  and  universities  have  been  developed.  Dr.  Griffin  would  like  to  thank  the 
sponsor  for  this  opportunity. 

Modeling of metals was accomplished by modification of the Boltzmann relaxation assumption
and other effects to accommodate the transitional nature of the THz regime.

Modeling of plastics was accomplished by recognition of the dominant role of water in the
plastic behavior, use of the modified water model and use of empirical water uptake data
such as measured time constants.

Water  was  modeled  utilizing  extensively  corrected  variations  on  Debye’s  theory  and 
absorption  lines.  Ice  and  snow  were  similarly  modeled. 

Human  tissue  was  modeled  by  recognition  of  the  dominant  effect  of  the  included  water 
and  ignoring  the  second  order  tissue  structure  effects. 

Techniques to enhance THz target performance in imaging applications through unique
procedures gave a number of promising results. In particular, a re-interpretation of
clutter as it relates to THz imaging and water distribution suggests a relationship to
spatial 1/f noise. A review of fundamental physical behaviors suggests a second-order,
noise-like water distribution described by the Helmholtz equation. Applying this to,
say, people suggests that the convenient coordinates for separating the Helmholtz equation
are elliptic-cylindrical coordinates. However, given the restricted field of view of THz
imaging systems, it makes more sense to add, ad hoc, a spatial 1/f noise of the proper
strength than to model the water distribution.

Finally,  the  application  of  compressive  sensing  techniques  to  spectral  identification  of 
materials  in  an  image  is  promising. 

Technical  details  of  the  results  are  published  as  itemized  in  section  4.4. 

4.3.5  Future  work 

Performance modeling of FPA issues such as detector pitch, sensitive area and noise
performance, including post-year-one developments, will be considered, as will
performance modeling of subsequent electronic processing.

Develop a performance model for simulated images in the THz range, including system
resolution, system uniformity and environmental propagation.

Develop and revise the performance model utilizing human studies on simulated images for
comparison and validation.

Publications and additional proposals have been made and received as a result of this
contract. Follow-on research has been suggested by these results. New interactions with
ARL at White Sands, NVESD at Ft. Belvoir, RTTC at Redstone and ONR have been
developed. Dr. Griffin would like to thank the sponsor for this opportunity.

5.0 Display Characterization and Digital Signal Processing
Development (Dr. Aaron Robinson)

This report summarizes the advances made during the period extending from June 1, 2006
to May 31, 2007. It describes the research efforts undertaken in support of the stated
milestones and the added focus of the Center for Advanced Sensors (CAS) on Intelligence,
Surveillance and Reconnaissance (ISR) sensors.

5.1  Display  Characterization 

The characterization of one of the Barco LCD monitors at the Center for Advanced
Sensors was completed during this quarter. The LCD characterization parameters were
obtained using the MATLAB program and image set provided by Jeff Olson at the
Army’s Night Vision and Electronic Sensors Directorate (NVESD). The same program and
imagery were used to characterize the CAS’s Barco CRT monitors. The results are
recorded in Tables 5-1 and 5-2.




Recall that for CRT monitors, the output luminance for a given input grayscale value can
be determined with the following equation:

$$L_{out} = L_{min} + A\left(\frac{grayscale}{grayscale_{max}}\right)^{\gamma} \qquad (5\text{-}1)$$


The MATLAB program and imagery were applied to obtain the values in Equation 5-1
for the LCD monitor characterization. Table 5-1 gives the best-fit values obtained for the
five measured characterization parameters for one of the LCDs.

Luminance equation input parameters                                   Output
A             V0            Gamma         L0            tau           RMS Error
28.21390693   130.066838    3.4744934     0.35658092    0.0357121     0.0986244

Table 5-1 Barco LCD Monitor Characterization Parameters




Table 5-2 gives the best-fit values obtained for the five measured characterization
parameters for one of the Barco CRTs. These data were included in the report from the
previous quarter and are reproduced here to aid comparison.

Luminance equation input parameters                                   Output
A             V0            Gamma         L0            tau           RMS Error
28.01597886   130.0741935   2.390327003   0.140059429   0.379037758   0.132559958

Table 5-2 Barco CRT Monitor Characterization Parameters
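
The effect of the fitted parameters can be illustrated with a short calculation. The following is a minimal sketch, assuming the luminance model has the gain-offset-gamma form L_out = L_min + A (grayscale / grayscale_max)^gamma, that the tabulated L0 plays the role of L_min, and that the input is an 8-bit grayscale with a maximum of 255. These are assumptions made for illustration. The roles of V0 and tau are not stated in this section, so they are not used here, and the function name is hypothetical.

```python
def display_luminance(grayscale, l_min, gain, gamma, grayscale_max=255):
    """Output luminance for an input grayscale under the assumed gamma model."""
    if not 0 <= grayscale <= grayscale_max:
        raise ValueError("grayscale outside the displayable range")
    return l_min + gain * (grayscale / grayscale_max) ** gamma


# Best-fit values copied from Tables 5-1 and 5-2 (A, Gamma and L0 only).
LCD = {"l_min": 0.35658092, "gain": 28.21390693, "gamma": 3.4744934}
CRT = {"l_min": 0.140059429, "gain": 28.01597886, "gamma": 2.390327003}


if __name__ == "__main__":
    for level in (0, 64, 128, 192, 255):
        print(level,
              round(display_luminance(level, **LCD), 3),
              round(display_luminance(level, **CRT), 3))
```

Under these assumptions the two monitors reach similar peak luminance, while the larger LCD gamma keeps its mid-gray output well below that of the CRT.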

As expected, the luminance behavior of the LCD monitor was distinctly
different from that exhibited by the well-studied CRT monitors. This can be attributed to a
number of factors, including the difference in point spread functions between
the two monitor types, the fact that brightness and contrast cannot be individually
controlled on an LCD, and the effects of the narrowed dynamic range of the LCD. These
behaviors are reflected in Tables 5-1 and 5-2. Note that the minimum luminance value (L0)
for the LCD is higher than that of the CRT. The gamma and tau parameter values are also
significantly different. These differences imply that human performance
characteristics on CRT monitors may differ from those on LCD monitors. In order to
effectively quantify those differences in ISR tasks and to further aid in the LCD
characterizations, some perception experiments were planned.

5.2  Display  Characterization  Perception  Experiment  Development 

Because of concerns about obtaining real-world imagery representative of current ISR tasks,
alternative perception experiment plans were developed based on character and shape
identification. These applications were chosen because they closely resemble the ISR
task of identifying a specific object in a cluttered scene. This is similar to the security
and surveillance tasks of identifying a specific face in a crowd, identifying a specific car
on a highway or determining if a person has a particular type of weapon.

This  section  describes  the  forced-choice  experiments  developed  using  characters  and 
simple  shapes.  The  subjects  are  shown  one  object  from  a  known  set  and  must  identify 
that  object.  All  of  the  images  are  computer-generated  and  therefore  the  need  for  a 
physical  data  collection  is  eliminated.  A  probability  of  correct  ID  (PID)  distribution 
results  from  varying  blur,  object  size,  and  noise. 

In the event that real imagery can be obtained, new perception experiments will be
designed to determine its effects.

5.2.1  Character  Experiment 

There  were  two  character  sets  selected  for  use  in  the  experiment.  The  sets  are  2-3-5-E  and 
6-9-a-P.  The  characters  are  based  on  the  Rochester  Institute  of  Technology  (RIT) 
character  set;  the  RIT  characters  are  equally  recognizable  at  the  limits  of  perception.  The 
characters  used  here  are  thinner  than  the  RIT  characters  to  make  the  letters  harder  to 
identify  when  blurred.  Contrast  of  the  unblurred  characters  is  0.2. 




Figure 5-1 illustrates character presentation to test subjects. One randomly selected
character from the set 2-3-5-E is presented on the left, and one of the characters 6-9-a-P
is presented on the right. The subject selects the appropriate button on the left to match
the left character and the appropriate button on the right to match the right character.
Clicking on the “next” button brings up the next two characters. The set 2-3-5-E is always
presented on the left and the set 6-9-a-P on the right. A total of 96 character pairs are
presented to each subject. Eight character pairs are presented at each of four ranges and
three noise levels. Presentation of range, noise, and character is random.
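
The presentation schedule described above can be sketched in a few lines. The following is a minimal illustration, assuming ranges of 1 to 4 km and noise modulations of 0, 0.025 and 0.05 as listed in Table 5-3. The function name and the dictionary layout are hypothetical and do not represent the software actually used in the experiment.

```python
import random

LEFT_SET = ["2", "3", "5", "E"]
RIGHT_SET = ["6", "9", "a", "P"]
RANGES_KM = [1, 2, 3, 4]
NOISE_LEVELS = [0.0, 0.025, 0.05]
PAIRS_PER_CONDITION = 8


def build_character_trials(seed=None):
    """Return a shuffled list of character-pair presentations."""
    rng = random.Random(seed)
    trials = []
    for range_km in RANGES_KM:
        for noise in NOISE_LEVELS:
            for _ in range(PAIRS_PER_CONDITION):
                trials.append({
                    "range_km": range_km,
                    "noise": noise,
                    "left": rng.choice(LEFT_SET),    # always from 2-3-5-E
                    "right": rng.choice(RIGHT_SET),  # always from 6-9-a-P
                })
    rng.shuffle(trials)  # random order of range, noise and character
    return trials


if __name__ == "__main__":
    schedule = build_character_trials(seed=1)
    print(len(schedule))
```

Four ranges, three noise levels and eight pairs per condition give the 96 presentations seen by each subject.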


Figure 5-1 Character presentation during subject testing. The subject sees one
character from each set and selects the appropriate button beneath the character.
(Interface buttons shown: Pause, Next.)


Range (km)   Blur (see Eq. 17)   Noise (modulation)
1            20                  0
2            15                  0
3            11.7                0
4            10                  0
1            10                  0.025
2            10                  0.025
3            8.3                 0.025
4            7.5                 0.025
1            10                  0.05
2            7.5                 0.05
3            6.7                 0.05
4            7.5                 0.05

Table 5-3 Blur and noise applied at each range in character experiment.

Range (km)   Noise modulation   Color display   Mono. display
1            0                  0.98            0.97
2            0                  0.42            0.83
3            0                  0.12            0.47
4            0                  0.03            0.42
1            0.025              1               1
2            0.025              0.69            0.69
3            0.025              0.5             0.5
4            0.025              0.14            0.14
1            0.05               1               1
2            0.05               0.75            0.75
3            0.05               0.11            0.11
4            0.05               0.08            0.08

Table 5-4 Measured PID for character experiment


5.2.2  Shape  Experiment 

The  shapes  presented  in  this  experiment  include  a  circle,  a  rotated-square  with  rounded 
edges,  a  hexagon,  and  an  octagon.  The  hexagon  and  octagon  have  rotated  and  non-rotated 
versions.  Each  shape  has  an  area  of  24,450  ±  50  meters.  The  intrinsic  contrast  is  0.08. 

Figure  5-2  illustrates  the  perception  experiment  interface  used  to  accomplish  the  shape 
presentation.  Just  as  in  the  character  experiment,  each  presentation  requires  the  subject  to 
make  two,  independent  discriminations.  For  the  shape  on  the  left,  the  subject  sees  two 
options  at  the  bottom  of  the  display;  the  subject  also  sees  two  options  for  the  shape  on  the 
right.  The  subject  clicks  on  the  appropriate  shape  left  and  right  and  then  clicks  on  the 
“next”  button  to  bring  up  another  set  of  shapes. 


Figure 5-2 Interface used to present the square, hexagon, octagon, and circle in the
shape experiment.

The experiment consists of 100 circle-hexagon and 100 circle-octagon discriminations
with rotated and non-rotated shapes in equal number. There are 200 hexagon-octagon
discriminations with all combinations of rotated and non-rotated shapes in equal number.
There are 50 discriminations of rotated-square versus rotated-hexagon. The
discriminations are divided equally among five ranges. Discriminations at each range
include 20 circle-hexagons, 20 circle-octagons, 40 hexagon-octagons, and 10
rotated-square versus rotated-hexagon.
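
The per-range counts follow directly from dividing each total equally among the five ranges. The following is a minimal sketch of that arithmetic. The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
SHAPE_DISCRIMINATIONS = {
    "circle-hexagon": 100,
    "circle-octagon": 100,
    "hexagon-octagon": 200,
    "square-hexagon": 50,
}
N_RANGES = 5


def per_range_allocation(totals, n_ranges):
    """Divide each discrimination total equally among the ranges."""
    allocation = {}
    for pair, total in totals.items():
        if total % n_ranges != 0:
            raise ValueError(f"{pair}: {total} does not divide equally")
        allocation[pair] = total // n_ranges
    return allocation


if __name__ == "__main__":
    allocation = per_range_allocation(SHAPE_DISCRIMINATIONS, N_RANGES)
    print(allocation)
    print(sum(allocation.values()), sum(SHAPE_DISCRIMINATIONS.values()))
```

Each subject therefore makes 90 discriminations at each range and 450 discriminations in total.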


Range (km)   Blur 1 (see Eq. 17)   Blur 2 (see Eq. 17)
2            10                    40
2.5          12.5                  50
3            15                    60
3.5          18.5                  73.8
4            20                    80

Table 5-5 Range and blur combinations for shape experiment.

Range (km)   Circle-Octagon   Circle-Hexagon   Octagon-Hexagon   Square-Hexagon
2            0.48             0.84             0.83              1
2.5          0.1              0.76             0.59              0.93
3            0.16             0.32             0.29              0.93
3.5          0.16             0.16             0.12              0.8
4            0.1              0.22             0.0               0.53

Table 5-6 Measured PID for shape experiment.

Figure 5-3 The figure shows PID (ordinate) and DRI model metric (abscissa) for discriminating
circle and octagon. The diamonds are data and lines model predictions. The dashed line shows
prediction using Q50 giving best least-squares fit. The solid line shows prediction using a smaller
Q50. (Plot title: Circle-octagon; legend: Data, Model 1, Model 2; abscissa label: Metric, cycles/meter.)


Figure 5-4 The figure shows PID (ordinate) and DRI model metric (abscissa) for discriminating circle
and hexagon. The diamonds are data and lines model predictions. The dashed line shows prediction using
Q50 giving best least-squares fit. The solid line shows prediction using a smaller Q50.

Figure 5-5 The figure shows PID (ordinate) and DRI model metric (abscissa) for discriminating circle
and hexagon. The diamonds are data and lines model predictions. The dashed line shows prediction using
Q50 giving best least-squares fit. The solid line shows prediction using a smaller Q50.

Figure 5-6 The figure shows PID (ordinate) and DRI model metric (abscissa) for discriminating
hexagon and octagon. The diamonds are data and lines model predictions. The dashed line shows
prediction using Q50 giving best least-squares fit. The solid line shows prediction using a smaller Q50.




Figure 5-7 The figure shows PID (ordinate) and DRI model metric (abscissa) for discriminating square
and hexagon. The diamonds are data and lines model predictions. The dashed line shows prediction using
Q50 giving best least-squares fit. The solid line shows prediction using a smaller Q50.
(Plot title: Square-hexagon; legend: Data, Model 1, Model 2; abscissa label: Metric, cycles/meter.)


Figure  5-8  This  figure  illustrates  the  relationship  between  target-set  contrast  and  CTF  of  the  eye 
for  the  DRI  model.  Spatial  frequency  is  in  cycles  per  meter  on  target,  not  cycles  per  milliradian  at 
the  imager.  Because  the  target  set  contains  discrimination  cues  of  all  sizes,  something  is  visible  at 
all  ranges.  Probability  of  ID  depends  on  imager  resolution. 
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Figure  5-9  This  figure  illustrates  that,  for  a  specific  target  or  a  set  of  like  targets,  discrimination 
cues  have  a  specific  frequency  spectrum.  The  imager  must  be  able  to  resolve  the  specific  cues.  At 
some  range,  the  available  cues  become  blurred,  and  probability  of  ID  drops  to  zero. 
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Figure  5-10  Figure  shows  PID  (ordinate)  versus  metric  using  character  FFT  (abscissa).  Data  is 
shown  for  both  color  and  monochromatic  (mono)  displays.  The  dashed  curve  is  prediction  using 
Q50  based  on  least-squares  fit. 
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Figure  5-11  The  figure  shows  PID  (ordinate)  and  metric  using  FFT  of  circle  (abscissa)  for 
discriminating  circle  and  hexagon.  The  diamonds  are  data.  The  dashed  line  shows  prediction 
using  Q50  giving  best  least-squares  fit. 
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Figure  5-12  The  figure  shows  PID  (ordinate)  and  metric  using  FFT  of  circle  (abscissa)  for 
discriminating  circle  and  octagon.  The  diamonds  are  data.  The  dashed  line  shows  prediction 
using  Q50  giving  best  least-squares  fit. 
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Figure  5-13  The  figure  shows  PID  (ordinate)  and  metric  using  FFT  of  hexagon 
(abscissa)  for  discriminating  hexagon  and  octagon.  The  diamonds  are  data.  The  dashed 
line  shows  prediction  using  Q50  giving  best  least-squares  fit. 
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Figure  5-14  The  figure  shows  PID  (ordinate)  and  metric  using  FFT  of  square  (abscissa)  for 
discriminating  square  and  hexagon.  The  diamonds  are  data.  The  dashed  line  shows  prediction 
using  Q50  giving  best  least-squares  fit. 
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Figure  5-15  The  FFT  of  circle,  square,  hexagon,  or  octagon  is  used  to  predict  metric 
values.  The  figure  shows  the  minimum  and  maximum  metric  values  at  each  range. 


5.2.3  Model  Modifications  to  Predict  Specific  Object  PID 


Substituting the spatial frequency content of discrimination cues for CTGT is impractical
for many reasons. One major problem is defining which spatial attributes are important
discrimination cues. Another problem is that small changes in scale due to range or
orientation have little effect on target identification but have a significant effect on
the frequency spectrum.

A clue on how to proceed is taken from the DRI model itself. CTGT is target-set
contrast, not the contrast of discrimination cues. The assumption is that CTGT reflects cue
strength. Fortunately, understanding and quantifying discrimination cues is not necessary;
an experimentally determined Q50 quantifies task difficulty in the model.

The spatial frequency content of discrimination cues is bounded by the frequency
content of the target itself; this constraint is introduced through CTGT. For specific-object
identification, CTGT(ξ, η, Rng) is the Fourier transform of the object in cycles per
milliradian when the target is at range Rng. Analyses in this paper use MATLAB Fast
Fourier Transforms (FFT). Q is now calculated using Equation 5-2, but the TTPF is still
given by Equations 5-3 and 5-4.


Q  = 
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(5-4) 


Figure  5-10  compares  Equation  5-2  predictions  to  data  from  the  character  experiment. 
The  character  6  FFT  is  used;  the  TTPF  given  by  Equations  5-3  and  5-4  does  not  change. 
The  PSQ  is  0.96  with  a  Q50  value  of  0.025.  Using  the  FFT  of  any  character  2,  3,  5,  E,  6, 
9,  a,  or  P  yields  the  same  results  within  four  significant  digits. 

Figures  5-11  through  5-14  compare  predictions  from  Equation  5-2  to  data  from 
the  shape  experiment.  The  FFT  of  circle,  circle,  octagon,  and  square  are  used  for  Figures 
5-11  through  5-14,  respectively.  The  Q50  are  0.017,  0.005,  0.007,  and  0.001  for  Figures 
5-11  through  5-14,  respectively.  For  the  shape  experiment,  predicted  and  experimental 
probabilities  are  highly  correlated;  the  PSQ  is  0.96. 

The FFT of any shape in the experiment can be used to predict performance. Metric values
are calculated at each range using the FFT of circle, hexagon, octagon, and square. Figure
5-15 shows the minimum and maximum Q values at each range. Metric variation is very
small. If the objects are similar, the FFT of any object in the group can be used. If the
objects are different, then the FFT of the object-of-interest is used.

5.2.4  Conclusions 


Many  military  tasks  involve  locating  and  identifying  specific  objects.  Traditional  target 
acquisition  models  predict  the  average  probability  of  identifying  a  set  of  diverse  objects; 
these  models  predict  hardware  image  quality  using  a  set  of  targets  as  a  quality  standard. 
The  DRI  models  do  not  predict  specific-object  identification. 


The character and shape experiments described here illustrate the problem with using
DRI models to predict specific-object identification. DRI model predictions correlate
poorly with experimental data. This is because the objects-of-interest are of a certain size
rather than of various sizes. The DRI model assumption that CTGT is uniform at all spatial
frequencies is incorrect for specific-object identification.

Introducing  the  Fourier  transform  of  the  object-of-interest  into  the  DRI  model 
corrects  model  predictions.  Modified  model  predictions  are  highly  correlated  to 
experimental  data. 

Further, the necessary changes to predict specific-object identification are minor.
Currently, radiometric imagery of the target set is collected in the field. The imagery is
segmented to separate the target from local background. The value of CTGT is calculated
from the segmented imagery. To implement specific-target logic, superimpose the
segmented target image on the average background level and take an FFT. The FFT is
then used in the model in lieu of CTGT.
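
The superimpose-and-transform step can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the MATLAB used for the analyses in this report. The function name, the boolean mask convention, and the toy image are assumptions made for the example. The sketch leaves frequencies in cycles per image and does not convert them to cycles per milliradian at a given range.

```python
import numpy as np


def specific_target_spectrum(image, target_mask):
    """Paste the segmented target on the mean background and return the 2-D FFT magnitude.

    image       -- 2-D array of radiometric pixel values (hypothetical input)
    target_mask -- 2-D boolean array, True where a pixel belongs to the target
    """
    image = np.asarray(image, dtype=float)
    target_mask = np.asarray(target_mask, dtype=bool)

    # Average level of the local background (all pixels outside the target).
    background_level = image[~target_mask].mean()

    # Flat background at the average level with the segmented target on top.
    scene = np.full(image.shape, background_level)
    scene[target_mask] = image[target_mask]

    # 2-D FFT, shifted so that zero spatial frequency sits at the array center.
    return np.abs(np.fft.fftshift(np.fft.fft2(scene)))


# Toy example (assumed values): 8 x 8 background at level 10, 2 x 2 target at level 20.
image = np.full((8, 8), 10.0)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
image[mask] = 20.0
spectrum = specific_target_spectrum(image, mask)
```

The returned magnitude array plays the role of the object spectrum that replaces the scalar target-set contrast; scaling it to the imager's angular frequency axis is left out of this sketch.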

One  aspect  of  the  model  is  more  demanding  than  DRI  model  implementation. 
Much  thought  and  user  involvement  is  needed  in  choosing  a  confuser  set  to  establish 
Q50.  The  difficulty  of  finding  and  identifying  an  object  depends  on  how  much  it  looks 
like  random  articles  in  the  scene.  User  involvement  is  required  to  establish  target-of- 
interest;  user  involvement  is  also  required  to  establish  realistic  clutter. 
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5.3  Perception  Test  for  comparing  the  performance  of  LCD  and 
CRT  monitors 

Liquid crystal display (LCD) monitors and cathode-ray tube (CRT) monitors use vastly
different technologies. With their much smaller size and lower power consumption, LCD
monitors are taking the place of CRTs in many applications. Many differences are
known to exist in the performance resulting from the two technologies. The
LCD has a more rectangular point spread function, while the CRT produces a Gaussian
spot. CRTs can function well at many different screen resolutions, while the LCD is
limited by its discrete pixel size. CRTs are often capable of lower minimum luminances
(darker blacks) than LCDs. The gammas of LCDs and CRTs are known to be
different. The liquid crystal pixels of the LCD and the luminous phosphors of the CRT
are limited by different time-dependent transients. The contrast of an LCD can
depend on the angle of observation, while the angle of observation has little effect on the
contrast of imagery on a CRT. With so many differences present, a comparison of
observer performance using LCD and CRT monitors should be made.

The University of Memphis has obtained a Barco Coronis LCD grayscale monitor and a
Barco MGD521M CRT grayscale monitor for use in perception experiments. Both
monitors have equal pixel densities (approximately 100 pixels per inch). Both monitors
also have a 10-bit grayscale resolution. The monitors have different color temperatures,
which cannot be adjusted. They also have different gammas. Each monitor was
adjusted to have the same minimum luminance (1.2 cd/m^2) and the same middle gray
level luminance (25 cd/m^2).

Five observers participated in a number identification perception experiment. Observers
were shown low contrast numbers between 0 and 9 on a mid gray level background. Five
different font sizes were used between 6 point and 14 point. No blur or noise was
added, so that the limiting MTF of the system would be the monitor
MTF. Observers were instructed to keep their heads about 18 inches from the screen
(even with the edge of the desk), but they were not physically restrained. Each observer
performed 200 trials on both the LCD and CRT monitors.
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Figure  5-16  Example  perception  test  stimulus 

After completing the experiment, observers were asked on which monitor the task
was more easily performed. All agreed that the task was slightly easier on the LCD
monitor. The results confirm this. The graph below shows the average probability of
correct number identification as a function of the font size. Before the curves start to
saturate, a 6 to 8 percent difference in performance is measured between the curves.


[Plot: average probability of correct number identification versus font size in points, LCD and CRT curves]
Figure 5-17 Perception test results


As can be seen in the graph, a modest improvement in performance is produced when the
LCD is used. It should be noted that this test measures the effect of the different monitor
MTFs on performance, but it does not measure the effects of some other differences
between CRTs and LCDs. Static imagery was used, so none of the time-related factors
were measured. Also, the numbers had a single grayscale level against a middle
grayscale level background, so monitor gamma did not have a significant effect. A
perception test using diverse targets with flat grayscale histograms would be more
appropriate to test the performance change due to gamma differences.
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5 A  Intelligent  Surveillance  and  Response  Support 

This section details the progress made in support of the Center for Advanced Sensors'
increased focus on Intelligence, Surveillance, and Reconnaissance. Specifically, the initial
efforts to quantify the temporal and spatial resolution requirements for tracking human
targets from low-to-mid altitude imagers began in the previous quarter of this year. The
perception experiments and data analysis were completed during the time period covered
by this quarterly report. The details of the research development and the results are
summarized in the following sections.

5.4.1  Simulated  Target  Detail 

It  was  determined  that  simulation  provided  the  most  efficient  and  cost  effective  method 
of  obtaining  tracking  scenarios  that  isolated  the  parameters  under  study.  Therefore,  the 
Army’s  Night  Vision  and  Electronic  Sensor  Directorate’s  (NVESD)  EOSim  urban  terrain 
simulation  tool  was  used  to  generate  all  imagery.  With  the  use  of  EOSim,  the  target’s 
speed,  starting  point,  destination,  path  and  contrast  were  under  the  control  of  the 
researchers.  Control  was  also  provided  over  the  imager’s  altitude,  down-look  angle,  field 
of  view,  path  and  trajectory.  The  researchers  were  also  able  to  control  the  ground  sample 
distance  represented  in  the  imagery,  the  amount  of  clutter  placed  in  the  scene,  the  number 
of  possible  targets  and  the  obscurants. 

For  the  purposes  of  this  research  it  was  required  that  the  system’s  resolution  be 
sufficiently  limited  so  that  only  a  few  display  pixels  represent  the  target.  This  prevented 
target  identification  and  constrained  the  focus  of  the  research  to  the  tracking  issue. 

Videos  were  then  produced  that  showed  the  simulated  target  moving  throughout  the  site 
on  a  frame  by  frame  basis.  Each  video  frame  was  a  640  x  480  pixel  LWIR  still  image 
with  white  hot  polarity.  An  example  of  the  imagery  is  shown  in  Figure  5-18. 
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Figure  5-18  EOsim  Simulated  Urban  Terrain.  The  dark  box  around  indicates  the  position  of  the 
target.  The  figure  represents  LWIR  imagery  with  white  hot  polarity. 


5.4.2  Perception  Experiments 

For  the  purposes  of  this  research,  the  term  tracking  is  used  to  refer  to  the  act  of  following 
a  target  from  a  starting  point  to  a  destination.  Success  will  be  defined  as  correctly 
identifying  the  final  destination  of  a  specified  target.  A  good  example  of  a  representative 
task  could  come  from  the  non-real-time  analysis  of  persistent  surveillance  video  where  it 
may  be  desirable  to  determine  the  starting  point  of  a  target  in  close  proximity  to  some 
event of interest. Thus, the analyst would search the surveillance data in reverse and
carefully take note of the target's position until the starting point is determined.

The  target,  its  starting  point,  destination,  target  path  and  a  unique  set  of  operating 
conditions  combine  to  create  a  distinct  scenario.  The  blanket  term  “operating  conditions” 
is  used  to  represent  the  combination  of  a  number  of  different  variables.  These  variables 
include  the  contrast  of  the  target,  the  target  speed,  the  number  of  similar  looking  objects 
in  the  scene  (or  clutter),  the  band  of  the  imager  (LWIR,  SWIR,  Visible)  etc.  It  was 
determined  that  30  distinct  scenarios  were  required  to  sufficiently  represent  the  tracking 
task  with  respect  to  the  variables  under  study. 


5.4.3  Scenario  Development 

Extensive  consideration  of  the  problem  led  to  the  conclusion  that  anything  beyond  a 
trivial  scenario  (one  target  and  little  to  no  clutter)  would  make  it  difficult  to  extract  useful 
and  traceable  cause  and  effect  relationships.  This  was  crucial  to  the  initial  development 
of  this  line  of  research  and  the  subsequent  follow  on  stages.  Therefore,  only  one  target  at 
a  time  was  tracked  in  these  initial  experiments,  confusers  were  effectively  removed,  and 
clutter was provided by the environment itself (i.e., roads).

Representative  operating  conditions  were  determined  after  consultation  with  NVESD 
personnel.  The  conditions  chosen  are  as  follows.  The  average  marching  speed  of  a 
soldier  was  estimated  at  approximately  1.5  m/s.  The  imager  altitude  was  2250  meters 
above  the  simulated  environment.  The  number  of  possible  destination  buildings  was  set 
at  15  by  the  simulation  site. 

To  prevent  the  introduction  of  multiple  frame  rates  per  scenario  in  each  experiment, 
target  movement  was  restricted  to  those  sections  of  the  site  that  were  completely  visible 
to  the  camera  up  until  entrance  into  the  destination  building.  The  target  would  enter  one 
destination  building  only.  Upon  entrance  into  that  building,  the  target  would  not  re-exit. 

The ground sample distances tested were as follows: 0.5 m, 1 m, 2 m, and 4 m. The
frame rates tested were 8 fps, 4 fps, 1 fps, 0.25 fps, and 0.125 fps. The
combined effect of these two variables on human performance was determined by
completing the following experimental matrix.
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Resolution 

            8 fps    4 fps    2.0 fps    1 fps    0.5 fps

4 m           X        X         X         X         X
2.0 m         X        X         X         X         X
1 m           X        X         X         X         X
0.5 m         X        X         X         X         X

Table 5-7 Human Tracking Performance Experimental Matrix


Each  cell  in  the  experimental  matrix  represents  a  temporal  and  spatial  resolution 
combination.  There  were  10  distinct  scenarios  assigned  to  each  combination.  The 
experiments  were  segmented  according  to  spatial  resolution  or  ground  sample  distance. 
Therefore,  each  row  in  the  matrix  represents  a  separate  experiment. 

5.4.4  Results  and  Conclusions 

Intuitively,  it  was  expected  that  there  would  be  a  direct  correlation  between  smaller 
ground  sample  distances,  higher  frame  rates  and  increased  tracking  performance.  On  the 
most  basic  level,  the  speed  of  the  target  directly  influences  the  possible  travel  distance  (or 
uncertainty)  between  frames  and  larger  ground  sample  distances  result  in  reduced  imager 
capability  in  indicating  the  presence  of  the  single  human  target.  The  collected  data  is 
shown  in  Table  5-8. 


Corrected PCDD

                          Frame Rate
Resolution    8 fps    4 fps    1 fps    0.25 fps    0.125 fps

4 meter       0        0        0        0           0
2 meter       0.1      0.408    0.323    0.085       0.015
1 meter       0.985    1        0.985    0.754       0.538
0.5 meter     0.977    0.992    0.977    0.838       0.508

Table 5-8 Corrected probability of tracking a target to the destination.

Table 5-8 shows the probability that results when the Correct Destination
Determination Ratio (CDDR) is corrected to remove the effects of guessing. This result
is defined as the track probability. It is calculated as follows:

$$\text{Corrected Track Probability} = \frac{\text{CDDR} - P(\text{guess})}{1 - P(\text{guess})}$$  (5-5)

where P(guess) is defined as the probability of guessing the correct destination purely by
chance. For this research P(guess) = 1/15 to account for all destination buildings in the
simulation site.


The Correct Destination Determination Ratio (CDDR) is calculated by

$$\text{CDDR} = \frac{\text{number of correct answers per combination}}{\text{number of scenarios per combination}}$$  (5-6)
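
The guessing correction in Equations 5-5 and 5-6 can be written as a short Python sketch. The function names are illustrative and do not come from the report; the default P(guess) of 1/15 is the value stated above for the 15 destination buildings.

```python
def correct_destination_ratio(num_correct, num_scenarios):
    """Equation 5-6: fraction of scenarios in which the destination was correctly chosen."""
    return num_correct / num_scenarios


def corrected_track_probability(cddr, p_guess=1.0 / 15.0):
    """Equation 5-5: remove the contribution of chance from the raw ratio."""
    return (cddr - p_guess) / (1.0 - p_guess)


# A raw ratio equal to the chance level corrects to zero; a perfect score stays at one.
chance_level = corrected_track_probability(1.0 / 15.0)
perfect_score = corrected_track_probability(1.0)
half_correct = corrected_track_probability(correct_destination_ratio(5, 10))
```

A raw ratio below the chance level produces a negative corrected value under this formula; how such cases were handled in the reported data is not stated here.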

From the experimental results represented in Tables 5-4 through 5-7, we can conclude
that the 4 meter and 2 meter ground sample distances are inappropriate for tracking
humans under the conditions tested. The experimental results indicate that a 50 percent
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corrected  track  probability  is  not  reached  for  any  combination  of  these  ground  sample 
distances  and  any  frame  rate  under  consideration. 

A large increase in tracking performance is evident as the imager resolution is improved
from 2 meters to 1 meter. Specifically, track probabilities of greater than 50% were
achieved for all frame rates tested. This is in direct contrast to the 2 meter results, where
50% probability was not achieved. An approximate accuracy of 75% was realized at 0.25
fps, and the results approach unity for frame rates of 1 fps and above. Similar results are
observed in the 0.5 meter experiment. Thus, according to the data collected, it can be
concluded that a 1 meter imager ground sample distance is sufficient for tracking a moving
human target.

The development of the imager frame rate requirement was not as straightforward as that
of the spatial resolution. This is because the temporal resolution system
specification is an inherent function of target path and the characteristics of the terrain.
Therefore, steps were taken to ensure that the effects of both were considered in this
research and for future resolution developments. To accomplish this, an intermediate
variable called the track distance was used to encapsulate these characteristics.
Specifically, it was used to indicate the uncertainty distance (maximum possible distance
traveled between frames) that results in a 50% probability of a correct track.


The track distance for the simulation site was determined from examination of the
collected data. The track probability drops to approximately
50% when the frame rate is decreased to 0.125 frames per second (1 frame every 8
seconds) for both the 0.5 and 1 meter resolutions. Note that the data from these resolutions
were chosen because they both satisfy the tracking criteria established for the imager's
spatial resolution. A target moving at the assumed 1.5 m/s travels 12 meters in 8 seconds,
so the track distance is approximately 12 meters/frame. That track distance can now be
used to calculate the frame rate required to maintain 50% tracking probabilities for other
target velocities according to the following equation.


$$FPS_{req} = \frac{\text{target velocity}}{\text{track distance}}$$  (5-7)
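
As a worked check, the sketch below reads Equation 5-7 as required frame rate equals target speed divided by track distance, which is an interpretation of the garbled source equation. The function name is illustrative, and the 6 m/s speed is a hypothetical value chosen only to show the scaling.

```python
def required_frame_rate(target_speed_mps, track_distance_m=12.0):
    """Frame rate (frames per second) needed to hold the per-frame travel at the track distance."""
    return target_speed_mps / track_distance_m


# The marching speed used in the experiments recovers the 0.125 fps break point.
marching_fps = required_frame_rate(1.5)

# Hypothetical faster target: four times the speed needs four times the frame rate.
faster_fps = required_frame_rate(6.0)
```

The 12 meter default applies only to the simulation site studied; a different terrain or path geometry would need its own track distance.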

The results of these experiments have been written in journal format and submitted to the
Journal of Optical Engineering.
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6.0  Performance  Modeling  of  Sensors  to  Reduce  Infrared 
Signatures 

Period  Covered  1  June  2006  to  31  May  2007  by  Dr.  Edward  H.  Perry 

As intelligence, surveillance, and reconnaissance (ISR) systems take on larger and larger
roles in the field, it becomes more and more imperative to minimize the detectability of
American assets in combat zones. These assets include ground vehicles, aircraft, and
various  ground-based  ISR  imaging  systems.  Since  all  vehicles,  aircraft,  and  ISR  systems 
are  powered,  they  generate  heat  and  consequently  present  a  potentially  detectable  thermal 
energy  signature.  Because  the  devices  operate  at  temperatures  close  to  that  of  the  ambient 
environment,  their  thermal  signatures  lie  almost  completely  in  the  infrared  (IR)  portion  of 
the  electromagnetic  spectrum.  By  reducing  the  temperature  and/or  the  IR  emissivity  of 
exposed  surfaces,  thermal  signatures  can  be  significantly  reduced. 

6.1  Methods  to  Reduce  the  Infrared  Signature  of  Ground-Based 
Combat  Assets  Using  Passive  Methods. 

6.1.1.  Overview 

This  project's  goal  is  to  examine  various  methods  for  reducing  the  IR  signatures  of 
exposed  surfaces  employed  on  military  assets  to  minimize  the  detection  and  recognition 
of  these  assets  by  infrared  imaging  systems.  Methods  to  be  addressed  include  those  that 
lower  the  IR  emissivity  of  the  surfaces  and  those  that  lower  the  surface  temperatures. 

6.1.2.  Activities  to  date 

The  work  presented  herein  provides  the  foundation  for  development  and  testing  of 
passive  methods  which  aim  to  reduce  the  infrared  signatures  of  various  military  assets. 
During  the  past  year  materials  with  a  low  IR  emissivity  were  explored.  One  material,  a 
low-emissivity  tape  available  in  the  brown,  green,  and  tan  colors  employed  by  the 
military for camouflage, was explored in detail. The tape, manufactured by OptiCorp,
Inc., integrates a highly reflecting, low emissivity coating with a visual pigmentation
layer  such  that  the  resulting  material  exhibits  low  emissivity  in  the  infrared,  while 
maintaining  visual  and  near  infrared  reflectance  characteristics  consistent  with 
conventional  camouflage  materials. 

6.1.2.1 Experimental Measurements

A simple and reliable method was developed to measure the emissivity of surfaces. An
experimental apparatus to implement this method was designed, constructed and
validated. The experimental apparatus comprises four major components: an
infrared thermometer (IRT), a heated plate assembly, a temperature controller, and a
thermocouple readout device as shown in Figure 6-1.

The  heated  plate  assembly  consists  of  a  flexible  silicone  rubber  heater,  a  one  inch  thick 
aluminum  plate,  and  a  test  plate  all  in  series.  The  assembly  is  held  together  with  high 
temperature  thermal  conductive  paste.  Not  only  does  the  paste  hold  the  plates  and  the 
heater  together,  it  also  helps  maintain  a  uniform  temperature. 
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The  temperature  of  the  plate  was 
measured  with  nine  thermocouples 
attached  to  the  plate  in  a  3  x  3  grid  pattern. 
This  setup  verified  that  the  temperature 
across  the  plate  was  uniform  and  steady. 
The  temperature  was  controlled  with  a 
thermocouple  on  the  plate  connected  to 
the  temperature  controller.  Once  the 
desired  temperature  on  the  surface  of  the 
plate  was  reached,  the  heater  cycled  on/off 
to  maintain  the  specified  temperature 
input. 


By  adjusting  the  emissivity  setting  on  the 
IRT  until  the  latter  indicated  the  same 
temperature  as  the  thermocouple  array,  the 
emissivity  of  the  heated  surface  was 
determined.  Excellent  agreement  was  found  between  measured  values  and  those  reported 
in  the  literature  for  a  number  of  surfaces. 


To  measure  the  effectiveness  of  using  low-emissivity  surfaces,  the  experimental 
apparatus  described  above  was  employed.  A  15  cm  square  aluminum  plate  was  first 
painted  with  flat-black  paint.  Then  a  "woodland"  camouflage  pattern  was  generated  using 
the  low-emissivity  tape  discussed  earlier  for  the  brown,  green,  and  tan  colors.  This  plate 
is  shown  in  Figure  6-2. 


Thermocouples  were  attached  to  the  plate 
in  regions  representing  each  of  the  four 
colors.  The  plate  was  heated  to  300  K, 
325 K, and 350 K, and LWIR images were
obtained  using  a  FLIR  Systems 
ThermoVision®  A40M  infrared  camera. 

Two  of  the  images  obtained  are  shown  in 
Figure  6-3  for  plate  temperatures  of  325 
K  and  350  K.  Only  the  black-painted 
areas,  which  appear  white  in  the  image, 
produce  much  luminance.  The  green, 
brown,  and  tan  areas  all  display  low 
luminance  levels  as  would  be  expected 
for  such  low-emissivity  surfaces. 

Figure 6-2 Aluminum plate with woodland
camouflage pattern formed with flat black paint
and low-emissivity tape.

Clearly the thermal signature of an asset
covered with such an IR camouflage
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pattern  would  be  much  lower  than  the  signature  of  the  same  asset  painted  with  standard 
high-emissivity  paint. 


Figure 6-3 Thermal images (LWIR) of test plate at 325 K and 350 K. Low-emissivity tape
was  used  to  cover  all  areas  except  the  black  (appearing  as  white  in  the  images). 


6.2  Modeling  the  Effectiveness  of  Signature-Reduction 
Applications  in  the  Infrared 

6.2.1.  Overview 

Mathematical  models  were  developed  for  the  following  two  strategies  that  can  be 
employed  to  reduce  thermal  signatures: 

1)  employ  surfaces  with  low  IR  emissivities 

2) employ extended surfaces to reduce surface temperatures

The  ultimate  objective  is  to  demonstrate  the  effectiveness  of  various  signature-reduction 
techniques  by  synthesizing  thermal  images  of  assets  employing  these  techniques 
alongside  images  of  the  same  asset  without  thermal  signature-reduction. 


6.2.2.  Activity  to  Date 

From the well known Planck distribution law (Incropera & DeWitt, 1996), the rate at
which a unit area of a perfectly black surface emits radiant energy per unit wavelength
(spectral emissive power) is given by

$$E_{\lambda,b} = \frac{C_1}{\lambda^5 \left[ \exp\left( C_2 / \lambda T \right) - 1 \right]}$$  [6-1]
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where  Ci  =  3.74  x  108  W-pm4/m2  and  C2=  1.44  x  104  pm-K.  For  a  non-black  surface,  the 


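emissive power at a given wavelength is given by a similar expression

$$E_{\lambda} = \varepsilon_{\lambda} \frac{C_1}{\lambda^5 \left[ \exp\left( C_2 / \lambda T \right) - 1 \right]}$$  [6-2]
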
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where  s^is  the  "spectral  hemispherical  emissivity"  of  the  surface.  The  emissivity  is  a 
measure of a surface's ability to emit thermal radiation compared to the ideal emitter,
which is a blackbody. Thus, 0 < ε_λ < 1 for all surfaces, and for most surfaces the
emissivity varies with wavelength. In Fig 6-4 the emitted radiant energy for a blackbody
at 300 K is shown along with that for a non-black surface at 325 K with a "selective"
emissivity given by

ε_λ = 0.1 for λ < 3 μm and
ε_λ = 0.685 for λ > 3 μm.

Clearly, the spectral characteristics of the two surfaces are very similar. The total energy
in the 8-12 μm wavelength range (LWIR) is within 1% for the two surfaces. Thus, this
"selective" non-black surface at 325 K would appear very similar to a "black"
background at 300 K in an MWIR or an LWIR imaging system.
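
The LWIR band comparison can be checked numerically with a short Python sketch. It uses the Planck distribution with the constants C1 and C2 quoted above and the two-level selective emissivity from the text. The function names, the trapezoidal integration, and the step count are choices made for this illustration, not details from the report.

```python
import math

C1 = 3.74e8  # W um^4 / m^2, first radiation constant as quoted in the text
C2 = 1.44e4  # um K, second radiation constant as quoted in the text


def blackbody_spectral_power(wavelength_um, temperature_k):
    """Planck spectral emissive power of a black surface, W/m^2 per um."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temperature_k)) - 1.0))


def selective_emissivity(wavelength_um):
    """Two-level emissivity from the text: 0.1 below 3 um, 0.685 above."""
    return 0.1 if wavelength_um < 3.0 else 0.685


def band_power(temperature_k, emissivity, lo_um=8.0, hi_um=12.0, steps=400):
    """Trapezoidal integral of emissivity times blackbody power over a band, W/m^2."""
    width = (hi_um - lo_um) / steps
    total = 0.0
    for i in range(steps + 1):
        wavelength = lo_um + i * width
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * emissivity(wavelength) * blackbody_spectral_power(wavelength, temperature_k)
    return total * width


black_300 = band_power(300.0, lambda wavelength: 1.0)
selective_325 = band_power(325.0, selective_emissivity)
ratio = selective_325 / black_300  # expected to be close to 1 for the 8-12 um band
```

Changing `lo_um` and `hi_um` lets the same comparison be repeated for other bands.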


[Plot: emitted radiant energy versus wavelength (μm), 0 to 30 μm]


Fig 6-4 Comparison of the radiant energy emitted by a black surface at 300 K and
a selective surface (ε_λ = 0.685 for λ > 3 μm) at 325 K.

A  simple  way  to  reduce  the  temperature  of  a  surface  is  to  use  extended  surfaces,  or  fins, 
attached  to  the  surface  as  shown  in  Fig.  6-5.  This  can  be  demonstrated  easily  by 
considering  the  following  relation,  which  describes  the  heat  transfer  Q  from  a  surface  to 
its  surroundings  by  both  convection  and  radiation: 

$$Q = A_s \left\{ h (T_s - T_a) + \sigma \varepsilon_s (T_s^4 - T_a^4) \right\}$$  [6-3]

where A_s is the surface area, T_s the surface temperature, T_a the ambient temperature, h the
convective heat transfer coefficient, σ the Stefan-Boltzmann constant, and ε_s the IR
emissivity of the surface.
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Often  Eq.  6-3  is  written  simply  as 


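$$Q = A_s \bar{h} (T_s - T_a)$$  [6-4]

where $\bar{h}$ is the "overall" heat transfer coefficient (convective and radiative) defined
as

$$\bar{h} = h + \sigma \varepsilon_s \frac{T_s^4 - T_a^4}{T_s - T_a} = h + \sigma \varepsilon_s (T_s^2 + T_a^2)(T_s + T_a)$$  [6-5]

Rearranging Eq. 6-4, we find

$$T_s = T_a + \frac{Q}{A_s \bar{h}}$$  [6-6]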


Clearly, for a given heat transfer rate Q, the temperature of the finned surface decreases
with increasing surface area, provided the heat transfer coefficient remains unchanged.
However, Rao et al. [2006] have shown that this coefficient depends on a number of
parameters, including the fin spacing S and fin height H:


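[Eq. 6-7: correlation of Rao et al. [2006] for the heat transfer coefficient h. The equation is not legible in the source; the recoverable fragments are a leading coefficient of 0.102, a power of Ra, and an exponent of -0.04.]  [6-7]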


where $k_f$ is the thermal conductivity of the air, $\varepsilon_\lambda$ the IR emissivity of the fin, and $N$ the
number of fins. $N_r$ accounts for radiative heat loss from the fins, and $Ra$ is the Rayleigh
number associated with the natural convection from the fins. These two non-dimensional
parameters are defined as

[Equations defining $N_r$ and $Ra$ are missing from the source.]

where $\sigma$ is the Stefan-Boltzmann coefficient, $g$ the gravitational acceleration, $\beta$ the
thermal expansion coefficient of the air, $\nu_f$ the viscosity coefficient of the air, $\alpha_w$ the
thermal diffusivity of the air at the fin temperature, and $Gr$ is the Grashof number defined
as $Gr = Ra\,\alpha_w / \nu_f$.


From Eq. 6-7 it is readily apparent that the heat transfer coefficient decreases with
decreasing fin spacing and with increasing fin height. This occurs because the thermal
boundary layers on adjacent fins, which increase in thickness along the vertical
dimension of the fin, overlap at small fin spacings and large fin heights, resulting in
interference between the two fins.


Analyses were performed for arrays of rectangular fins attached to a heated 15 cm square
horizontal surface that must transfer 10.7 W (444 W/m²) to an environment at 300 K. It
was assumed that both the plate and fins were nearly thermally black (i.e., $\varepsilon_\lambda = 0.90$). Fin
heights of 25, 38, 51, and 76 mm were considered along with fin spacings ranging from
1 mm to 27 mm.


Figure  6-6  shows  the  results  obtained  for  steel  fins  (k  =  57  W/m-K).  First,  it  is  readily 
apparent  that  the  average  temperature  of  the  finned  array  is  considerably  less  than  the 
temperature  required  by  the  plate  alone  (350  K)  to  transfer  the  same  amount  of  heat  from 
the  plate.  Also,  an  optimum  is  observed  at  around  25  fins,  corresponding  to  a  fin  spacing 
of about 3 mm. This occurs because of the interference between adjacent fins discussed
above. Above about 25 fins the gains obtained by larger surface areas are offset by
reductions in the heat transfer coefficient.

[Figure 6-6 plot annotations: 150 mm x 150 mm steel plate, Q/A = 444 W/m²]

Figure 6-6 Fin Array Average Temperature as a function of the
Number of Fins Attached

From  a  thermal  signature  reduction  perspective,  temperatures  are  vitally  important  since 
the emittance of a surface varies as the fourth power of the surface temperature. Figure 6-7
shows a simulated LWIR image of an array employing steel fins. Black represents a
temperature  of  300  K  and  white  a  temperature  of  350  K.  The  thermal  signature  presented 
by  this  array  is  clearly  lower  than  it  would  be  for  the  plate  alone. 

Figure 6-7 Simulated LWIR image of a 10-fin array employing 51-mm
high fins. Mean temperature of the array is 320 K.

6.3 Performance Modeling of Persistent ISR Imaging Systems
in Adverse Environments

6.3.1 Overview

This  research  centers  on  modeling  the  performance  of  persistent  ISR  imaging  systems  in 
adverse  environments  such  as  those  present  in  secure  enclosures  in  the  hot  desert  where 
temperatures in excess of 120 °F can be expected and those present for airborne imaging
systems  at  altitudes  of  25,000  to  50,000  feet,  where  temperatures  ranging  from  -30  °F  to 
-70  °F  are  normal. 

6.3.2  Activities  to  Date 

In  a  hot,  dry  desert  environment  the  relative  humidity  of  the  ambient  air  is  quite  low, 
typically  below  10%  in  the  afternoon.  The  temperature  of  the  air  can  be  reduced 
considerably  by  spraying  water  into  the  air,  the  principle  behind  "swamp  coolers."  This 
process,  known  as  adiabatic  saturation  or  evaporative  cooling,  can  reduce  the  air 
temperature  by  15-25  Kelvins  (30  -  40  °F),  thus  making  the  air  much  more  suitable  for 
cooling  temperature-sensitive  electronic  equipment.  Theoretically,  an  evaporative  cooler 
can  bring  the  air  temperature  down  to  the  "wet-bulb"  temperature  of  the  air. 

A  typical  evaporative  cooler  is  shown  in  Fig.  6-8.  Hot,  dry  outdoor  air  is  drawn  through 
pads  which  are  saturated  with  water.  Water  evaporates  from  the  pads  and  humidifies  the 
air.  The  heat  required  for  the  evaporation  is  drawn  from  the  incoming  air,  causing  it  to 
cool.  The  air  exiting  the  unit  can  be  used  to  cool  electronic  equipment. 

The efficiency $\varepsilon$ of an evaporative cooling system is defined as:

$$\varepsilon = \frac{T_{DB,in} - T_{DB,out}}{T_{DB,in} - T_{WB,in}}$$  [6-8]

where $T_{DB}$ is the "dry-bulb" temperature of the air and $T_{WB}$ is the "wet-bulb" temperature.
The wet-bulb temperature is the lowest possible temperature that can be reached by an
evaporative cooling unit. Evaporative cooling units typically exhibit efficiencies on the
order of 85%. Rearranging Eq. 6-8, we find that the temperature $T_{DB,out}$ of the air leaving
an evaporative cooler is:

$$T_{DB,out} = T_{DB,in} - \varepsilon \left( T_{DB,in} - T_{WB,in} \right)$$  [6-9]

Figure 6-8 Typical configuration of an evaporative cooler

Figure 6-9 shows the exiting dry-bulb temperature as a function of the relative humidity
of the incoming air, which is assumed to be at a temperature of 317 K (110 °F). For 10%
incoming relative humidity, a leaving temperature of 297 K (75 °F) is predicted for a
cooler with an efficiency of 85%, a drop of 20 Kelvins (35 °F).
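
The efficiency relation of Eq. 6-8, solved for the leaving dry-bulb temperature, can be evaluated directly. In the sketch below the inlet wet-bulb temperature of 293.5 K is an assumed value, back-calculated from the 317 K inlet, 297 K outlet, and 85% efficiency quoted above; the report does not state it, and in practice it would come from a psychrometric chart or calculation.

```python
def cooler_efficiency(t_db_in, t_db_out, t_wb_in):
    """Eq. 6-8: achieved temperature drop over the maximum possible drop."""
    return (t_db_in - t_db_out) / (t_db_in - t_wb_in)


def outlet_dry_bulb(t_db_in, t_wb_in, efficiency):
    """Eq. 6-8 solved for the dry-bulb temperature of the leaving air, in K."""
    return t_db_in - efficiency * (t_db_in - t_wb_in)


T_DB_IN = 317.0  # inlet dry-bulb temperature, K (from the text)
T_WB_IN = 293.5  # inlet wet-bulb temperature, K (assumed, not from the report)
t_out = outlet_dry_bulb(T_DB_IN, T_WB_IN, efficiency=0.85)
print(f"Leaving air temperature: {t_out:.1f} K")
```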


Although evaporative cooling units do require a supply of water, a scarce commodity in
most desert environments, the water requirements are modest. For example, 1 kg of water
can cool 90 kg of air (approximately 80 cubic meters) by 20 Kelvins, the equivalent of
1.6 million Joules. Thus, an evaporative cooling unit dissipating 500 W of heat from an
electronics system in a 317 K environment (110 °F) would require approximately one liter
of water per hour while allowing the equipment to operate at a temperature that is 20
Kelvins cooler.
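
The water budget can be reproduced from the figures in the preceding paragraph. The sketch assumes, as the text does, that each kilogram of evaporated water absorbs about 1.6 MJ, and that 1 kg of water is roughly one liter; the function and variable names are illustrative.

```python
COOLING_PER_KG_WATER = 1.6e6  # J absorbed per kg of water evaporated (from the text)


def water_use_kg_per_hour(heat_load_w, cooling_per_kg=COOLING_PER_KG_WATER):
    """Water evaporated per hour to absorb a steady heat load."""
    return heat_load_w * 3600.0 / cooling_per_kg


rate = water_use_kg_per_hour(500.0)
print(f"Water required: {rate:.2f} kg/h (about {rate:.1f} L/h)")
```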


[Figure 6-9 plot annotations: evaporative cooler, T_DB,in = 317 K (110 °F), efficiency = 85%]

Figure 6-9 Cooling performance of an 85% efficient evaporative
cooling system operating at various inlet humidity levels

7.0 Final Report for Center for Advanced Sensors - Vanderbilt
University

Period Covered: 1 April 2005 to 31 Oct 2007, US Army Contract W911NF-05-2-0019,
By Profs. J. L. Davidson, A. B. Bonds, W. P. Kang

7.1  Administrative 

The Vanderbilt Program in support of the Center for Advanced Sensors was
completed on schedule and within the spend plan. Figure 7.1-1 below summarizes the projects
engaged by Vanderbilt in support of the CAS program. Because the funding level for the
Vanderbilt subtask was reduced in Year 2, the “Bio-Optic Sensor Electrode
Development” task took a diminished role and certain other tasks were reduced. The
discussion and figures that follow are results for the program portion of the Vanderbilt
tasks. Some of this information was presented in previous interim reports and is collected
here. Because the Vanderbilt funds were terminated at the close of Year 2,
this is the final report.


Center  for  Advanced  Sensors 

Davidson/Bonds  Vanderbilt  University 


>  Bio-Optics  of  Vision 

>  Bio-Optic  Sensor  Electrode  Development 

>  IR  Display  Development 


Figure 7.1-1 Summary of projects engaged by Vanderbilt in support of the CAS program

7.2  Bio-Optics  of  Vision  -  Professor  A.B.  Bonds 
7.2.1  Overview 

Positioning of Vanderbilt’s biological vision initiative to ARL-NV goals
Our  work  is  directed  at  identifying  mechanisms  used  by  the  brain  for  object  segregation. 
A  principal  symbiosis  appears  to  be  in  the  area  of  search.  Strategies  used  by  the  brain  in 
segmenting  salient  visual  structures  from  background  clutter  may  prove  useful  in 
improving  classification  across  large  numbers  of  images.  These  strategies  may  also  be 
applied to enhance human performance in direct recognition tasks. The importance of
testing  these  results  via  integration  with  modeling  is  noted. 

7.2.2  Objective 

Research natural (living organism) visual sensory representation as performed by
neural assemblies, recording isolated neural activity across a network of cells. Determine
the coding of visual signals by cell populations and develop a working view of the neural
code, with the aim of achieving superior electronic approaches to rapid AI night vision
image signal processing. Our fundamental hypothesis is that global structure is synthesized
in the brain by the dynamic formation of cellular assemblies identified by firing that is
synchronous on a msec scale, the Binding By Synchrony (BBS) model. Testing and
analyses are directed toward validation of this hypothesis. The first section
(7.2.3.1-7.2.3.5) analyzes the general characteristics of firing synchrony and the second
section (7.2.3.6-7.2.3.9) explores the relationship between synchrony and contour detection.

These  topics  engage  multiple  disciplines,  including  (1)  visual  neuroscience,  (2)  imaging, 
processing  and  detection,  (3)  automated  decision  making/classification  and  (4)  statistical 
analysis. 

7.2.3  Results 

7.2.3.1 Frequency of occurrence of synchronous groups

The  importance  of  cellular  assemblies  defined  by  synchrony  in  the  process  of 
object  cognition  is  dependent  on  how  often  such  assemblies  arise  in  the  process  of  normal 
vision.  A  high  frequency  of  occurrence  would  suggest  that  the  mechanism  is 
fundamental to brain function. We presented 4500 natural images, consisting of 250 images
each rotated to 18 angles (20° to 360°, in 20° increments). Each image subtended 20° x
20° and was jittered, star fashion, at 60 Hz for 0.5 s. The images were displayed in a
random order, each followed by 0.5 s of mean luminance. The entire sequence was shown 10
times (in different random orders) and pictures were scaled to have the same global
contrast. Fifty groups at each size from 2 to 8 cells were selected at random for analysis,
with some preference for groups covering large areas in space. Group responses were quantified,
averaged across all 10 trials, weighted by their scores, and summed to produce a
composite of the average response. We have found that any arbitrary group of 8 cells or
fewer will reliably synchronize to some of the 4500 images (Figure 7.2-1), and some
groups of 9-10 cells will repeatedly synchronize to a few (e.g., 5/4500) particularly
complex  images.  While  the  fraction  of  stimuli  that  drive  an  assembly  drops  as  assembly 
size  grows,  this  is  a  natural  consequence  of  increased  selectivity  of  the  higher  order 
assemblies  to  more  complex  visual  features  (see  below).  The  number  of  potential 
assemblies  grows  geometrically  with  assembly  size,  markedly  increasing  the  likelihood 
that  any  image  will  trigger  the  formation  of  at  least  one,  and  more  probably  several, 
assemblies. 

[Figure 7.2-1 panels: A. Synchrony; B. Average Firing Rate; horizontal axes: Assembly Size]

Figure 7.2-1 Average fraction of 4500 stimuli that produce responses greater than
1.96(stdev) above chance as a function of assembly size for synchrony (A) and firing rate (B).

7.2.3.2 Selectivity of cellular assemblies

In  a  pilot  experiment,  we  measured  the  kurtosis  and  population  and  lifetime 
sparseness  of  synchrony  and  firing  rate  response  distributions  of  a  set  of  cells  viewing  a 
natural  image  sequence  for  the  purpose  of  establishing  the  degree  of  selectivity  of 
particular  assemblies.  Three  hundred  fifty  (350)  assemblies  were  chosen  randomly  for 
analysis  with  50  assemblies  per  size  group  (2-8  cells).  Fig  7.2-2  plots  the  synchrony  (or 
firing  rate)  normalized  to  a  maximum  of  1  against  the  probability  of  that  value  of 
synchrony  (or  firing  rate)  across  all  natural  image  presentations.  The  results  show  that 
synchrony  response  distributions  were  highly  leptokurtic  (selective  with  only  a  few 
images  generating  high  responses  in  the  right  tail  of  the  distribution)  and  kurtosis 
increased  nearly  exponentially  with  group  size.  This  means  that  for  successively  larger 
assemblies,  there  were  fewer  and  fewer  images  that  generated  large  responses.  On  the 
other  hand,  the  kurtosis  for  average  firing  rate  response  distributions  decreased  slightly 
across  assembly  size.  This  suggests  that  increasing  the  assembly  size  enhances  the  ability 
of  synchronous  assemblies  to  discriminate  among  the  natural  image  sequence  while  it 
hinders  the  selectivity  for  a  firing  rate  code,  which  becomes  highly  redundant  and 
confounding.  Note  that  the  assembly  in  the  image  with  the  largest  synchrony  response 
from  Figure  7.2-2  tends  to  lie  on  a  single  complex  contour,  suggesting  that  cellular 
assemblies  are  important  in  the  process  of  contour  integration.  This  finding  is  explored 
further in 7.2.3.7.
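
As a hedged illustration of the selectivity measure, the sketch below computes the excess kurtosis of a response distribution using the standard moment definition (fourth central moment over squared variance, minus 3). The report does not state which kurtosis estimator was used, and the response vectors here are made-up values, not data from the experiment.

```python
def excess_kurtosis(responses):
    """Fourth central moment over squared variance, minus 3 (0 for a Gaussian)."""
    n = len(responses)
    mean = sum(responses) / n
    m2 = sum((r - mean) ** 2 for r in responses) / n
    m4 = sum((r - mean) ** 4 for r in responses) / n
    return m4 / m2**2 - 3.0


# Hypothetical normalized responses of one assembly to 20 images.
selective = [0.0] * 19 + [1.0]       # one image drives a large response
broad = [i / 19 for i in range(20)]  # responses spread evenly over images
print(excess_kurtosis(selective), excess_kurtosis(broad))
```

A distribution dominated by a few large responses (leptokurtic) gives a large positive value, while evenly spread responses give a negative one.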

Population and lifetime sparseness were computed using the Rolls & Tovee (7.4-31)
measure as modified by Willmore & Tolhurst (7.4-45). For a single assembly size
(Figure  7.2-3),  synchrony  responses  are  much  sparser  than  firing  rate  responses  and  this 
difference  increases  when  assembly  size  grows.  This  means  that  for  our  stimulus  set, 
assemblies  of  2-8  cells  exhibited  high  population  sparseness,  where  only  a  small  number 
of  groups  out  of  50  (per  assembly  size)  responded  strongly  to  a  single  image,  and  high 
lifetime  sparseness,  where  each  assembly  only  responded  strongly  to  a  few  images  out  of 
the  set  of  all  images  (4500).  Firing  rate  responses  became  less  sparse  as  each  new 
member contributed more images to the response set. The firing rate results are expected
to  change  when  requiring  assembly  members  to  fire  above  a  threshold,  but  preliminary 
analyses  show  that  they  will  still  be  significantly  less  sparse  than  synchrony  responses. 
Overall  this  suggests  that  image  structure  is  fundamentally  coded  by  the  identity  of  the 
particular  members  of  the  synchronous  assemblies  rather  than  the  relative  activities 
across  the  population  as  a  whole.  A  more  detailed  investigation  of  sparse  coding  deriving 
from these pilot observations is found in 7.2.3.9.
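
The sparseness measure can be sketched as follows. The form used here, S = 1 - (mean |r|)² / mean(r²), is one common way of writing the Rolls & Tovee activity ratio so that larger values mean sparser responses; whether it matches the exact variant used in this study is an assumption, and the response values are hypothetical.

```python
def sparseness(responses):
    """1 - (mean |r|)^2 / mean(r^2); 0 for uniform responses, near 1 when sparse."""
    n = len(responses)
    mean_abs = sum(abs(r) for r in responses) / n
    mean_sq = sum(r * r for r in responses) / n
    if mean_sq == 0.0:
        return 0.0  # no response at all: treat as not sparse
    return 1.0 - mean_abs**2 / mean_sq


# Rows: hypothetical assemblies; columns: hypothetical images.
responses = [
    [0.9, 0.0, 0.0, 0.1],
    [0.0, 0.8, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.7],
]
lifetime = [sparseness(row) for row in responses]          # one assembly, many images
population = [sparseness(col) for col in zip(*responses)]  # many assemblies, one image
print(lifetime, population)
```

Applying the same function along rows or columns of the response matrix gives lifetime or population sparseness, respectively.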


[Figure 7.2-2 panels: A. Synchrony Distributions; B. Firing Rate Distributions]

Figure 7.2-2 (A) Sample synchrony and (B) average firing rate response distributions for
assemblies of 2, 4, 6, and 8 cells. The kurtosis of the synchrony distributions increases
nearly exponentially with assembly size (the red and yellow bars extend beyond this figure,
which was cropped to show the range of responses). The green bars show the detailed
distribution of the highest responses for a group of four cells, and the insets show the images
with the largest synchrony and firing rate for the same assembly of 4 neurons.

[Figure 7.2-3 panels: A. Population Sparseness; B. Lifetime Sparseness]

Figure 7.2-3 (A) Population sparseness as a function of assembly size. Population
sparseness describes the response distribution of many assemblies to one image. (B)
Lifetime sparseness as a function of assembly size. Lifetime sparseness describes the
response distribution of one assembly to many images.

7.2.3.3  Reliability  of  cross-correlation  measures 

In  the  course  of  presenting  our  results  and  publishing,  we  have  confronted  a  good 
deal  of  concern  regarding  the  primary  method  by  which  we  quantify  neural  interactions 
(the  Joint  Post-Stimulus  Time  Histogram  (JPSTH)  of  Aertsen,  et  al,  1989,  7.4-2).  There 
is  considerable  debate  on  the  problem  of  quantifying  synchrony  because  of  many 
influences  on  firing  patterns.  Our  conclusions  depend  on  reliable  estimates  of  stimulus- 
dependent  changes  in  the  correlation  magnitude,  and  we  must  prove  that  these  changes 
represent  true  changes  in  effective  connectivity.  There  are  two  main  issues:  (1)  How 
changes  in  average  firing  rate  are  normalized  and  (2)  How  nonstationary  firing  patterns 
are  interpreted. 

Normalization. The basic criticism of the JPSTH method of Aertsen et al (1989, 7.4-2)
is that it was calibrated empirically at lower firing rates. For the positive correlations that
we study, interaction is well-approximated by the normalized correlation coefficient until
the probability of a spike per bin (p) exceeds about 0.3 (Ito & Tsuji (2000), 7.4-22, their
Fig 4). We record from a large group, so any given stimulus is suboptimal for most cells
and firing rates are lower than found with stimuli artificially tailored to single units.
Across a sample of 111 cells, fewer than 1% of total spikes had a p > 0.3 (see Methods for
details), suggesting that for nearly all of our data the strategy for firing rate normalization
is not misleading. There are three common methods for normalization: (1) The JPSTH
uses the geometric mean of the autocorrelations. (2) Bair et al (2001), (7.4-5), and Kohn
& Smith (2005), (7.4-24), combine the shift predictor method and the geometric mean of
the firing rates. (3) The NCC (Neural Correlation Coefficient: Abeles (1982), (7.4-1);
Eggermont (1992), (7.4-12); Hirabayashi & Miyashita (2005), (7.4-19)) uses the shift
predictor and the geometric mean of the autocorrelations. All are conceptually similar
and result in nearly identical quantities and behavior under a diverse range of conditions
(static and dynamic stimuli; anesthetized and awake animals; cat and monkey; A1, V1,
MT, and IT). Normalization based on the product of the firing rates is generally less
effective (Bair et al. 2001, 7.4-5). Fig. 7.2-4 shows the relative consistency of the three
methods across summed firing rates modulated by varying stimulus orientation. The JPSTH
is the most conservative estimate and the datum at 150° indicates independence from firing
rate. Both type analysis (Samonds et al 2003, 7.4-32) and gravitational clustering (Samonds
et al 2004, 7.4-33), which quantify synchrony in wholly different ways, yield similar
patterns of stimulus dependence of cooperation between pairs. Ito & Tsuji (2000), (7.4-22),
demonstrate that any approach to normalization can be flawed without considering
the underlying models of spike generation, but stress that there is no approach that is
universally applicable due to different firing models in different cells, or even within a
given cell under different conditions. Within this context they conclude that Aertsen's
JPSTH model is useful as a generalized approach to analyzing experimental data.

Figure 7.2-4 Three normalization methods (horizontal axis: orientation, degrees). All show
loss of synchrony at 150° despite high firing rate.
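
The JPSTH-style normalization can be sketched in a few lines. Following the usual statement of the normalized JPSTH, each bin pair (u, v) holds the across-trial covariance of the two spike counts divided by the square root of the product of their across-trial variances (the geometric mean of the zero-lag autocovariances). The binary spike matrices below are toy data and the function name is hypothetical; this is a minimal illustration, not the analysis code used in the study.

```python
import math


def normalized_jpsth(trials_a, trials_b):
    """Correlation coefficient across trials for every bin pair (u, v).

    trials_a and trials_b are lists of trials; each trial is a list of
    spike counts per time bin. Entry [u][v] is cov(a_u, b_v) divided by
    sqrt(var(a_u) * var(b_v)), or 0.0 where either variance is zero.
    """
    n_trials = len(trials_a)
    n_bins = len(trials_a[0])

    def column(trials, k):
        return [trial[k] for trial in trials]

    def mean(values):
        return sum(values) / len(values)

    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for u in range(n_bins):
        a = column(trials_a, u)
        mean_a = mean(a)
        var_a = mean([(x - mean_a) ** 2 for x in a])
        for v in range(n_bins):
            b = column(trials_b, v)
            mean_b = mean(b)
            var_b = mean([(y - mean_b) ** 2 for y in b])
            cov = mean([(a[i] - mean_a) * (b[i] - mean_b) for i in range(n_trials)])
            if var_a > 0.0 and var_b > 0.0:
                matrix[u][v] = cov / math.sqrt(var_a * var_b)
    return matrix


# Toy data: 4 trials x 2 bins; cell B fires exactly when cell A fires.
cell_a = [[1, 0], [0, 1], [1, 1], [0, 0]]
cell_b = [[1, 0], [0, 1], [1, 1], [0, 0]]
jpsth = normalized_jpsth(cell_a, cell_b)
synchrony = sum(jpsth[k][k] for k in range(2)) / 2  # mean of the zero-lag diagonal
print(jpsth, synchrony)
```

Subtracting the product of the mean counts removes the stimulus-locked (PSTH) contribution, so what remains on the diagonal reflects trial-by-trial coincidence rather than shared rate modulation.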

Nonstationary  firing  patterns.  Temporally  non-uniform  influences  on  firing  rate  can 
lead  to  spurious  peaks  in  the  cross-correlogram  that  do  not  necessarily  contribute  to  the 
cooperative  encoding  of  information.  There  is  no  definitive  solution  to  this  problem,  but 
consideration of the width of the correlogram peaks lends confidence to the analysis.
Covariation  of  both  latency  and  excitability  yields  peaks  that  do  not  directly  reflect 
instantaneous  spike  timing  covariation  (i.e.,  functional  cooperation)  between  neuron  pairs 
(Brody 1999a,b; 7.4-9,10). However, latency or excitability covariation generates peaks
that  are  on  a  broader  time  scale  (similar  to  PSTH  peaks)  than  those  from  spike  timing 
covariation  (Brody  1999b,  7.4-10).  Brody  cites  examples  of  correlogram  peaks  that  are 
"clearly  not  caused  by  excitability  covariations"  in  Ts'o  et  al  (1986),  (7.4-39),  which 
shows  peaks  of  a  width  not  exceeding  10  msec,  similar  to  all  of  the  peaks  that  we  report 
(e.g.,  Samonds  et  al  2006,  7.4-34).  As  an  example  of  the  selectivity  of  our  analysis,  in  a 
study  involving  grating  stimuli,  which  generate  synchrony  only  between  cell  pairs  with 
similar  orientations,  the  JPSTH  identified  only  79  of  784  possible  pairs  (10%)  as  showing 
significant  synchrony  (Samonds  et  al  2004,  7.4-33).  With  a  more  comprehensive  set  of 
stimuli  designed  to  generate  synchrony  in  both  collinear  and  cocircular  pairs,  we 
identified  significant  synchrony  in  188/631  co-responsive  pairs  (Samonds  et  al.  2006, 
7.4-34),  but  this  still  constituted  only  30%  of  the  total.  In  matches  between  circular 
stimuli  and  receptive  field  configurations,  only  2/307  pairs  (0.7%)  showed  significant 
synchrony  where  there  was  no  clear  relationship  between  the  stimulus  and  the  receptive 
fields  (i.e.,  false  positives).  We  are  confident  that  with  our  systematic  application  of  a 
comprehensive  set  of  stimuli,  restriction  of  analyses  to  correlation  peaks  of  10  msec  or 
less  and  comparisons  of  stimulus  patterns  with  receptive  field  maps,  we  have  neither 
over- nor under-estimated informative synchrony to any significant degree (see 7.2.3.5).

7.2.3.4  Correlation  between  JPSTH  and  Coherence  Analyses 

Synchronized  neural  responses,  which  often  are  accompanied  by  oscillations  in 
the  gamma  frequency  band,  exist  extensively  in  visual  cortex  and  are  proposed  as 
supporting  the  perception  of  visual  structure.  Neural  synchrony  and  oscillation  are 
normally  studied  with  cross-correlation  analysis  and  coherence  analysis  respectively, 
which  measure  the  association  between  neural  activity  in  the  time  and  frequency 
domains.  We  studied  responses  from  cat  visual  cortex  to  explore  the  relationship  between 
synchrony  and  coherency.  With  a  Cyberkinetics  10x10  microelectrode  array,  we  recorded 
66  complex  cells  from  areas  17  and  18  in  two  paralyzed  and  anesthetized  cats.  Drifting 
sinewave gratings (SF = 0.5 cycle/°, TF = 2 Hz, Contrast = 50%) were used as visual
stimuli. We identified 694 pairs that showed significant synchrony using the JPSTH
representation of correlation. We also studied frequency dependence in these
synchronized pairs with multi-taper coherence analysis (Chronux 1.0). The multi-taper
method computes the averaged spectral estimates by applying several orthogonal
windowing functions (i.e., Slepian functions). Thus, the multi-taper method provides a
coherence estimate with reduced bias and variance compared to direct spectrum
estimation with a single windowing procedure. Among all cell pairs that exhibited
significant synchrony, we found that 98.4% of data samples showed coherence values
higher than the 95% confidence interval at certain frequency band(s) (Figure 7.2-5[A &
B]).

Figure 7.2-5 (A) Cross-correlation analysis with JPSTH method shows that a pair of cells
exhibited strong synchrony in their firings. (B) Coherence analysis on the same cell pair
with multi-taper method shows that the frequency dependence between the two spike trains
mainly exists in a frequency band of 25-65 Hz.

To explore the relationships between JPSTH and coherence analyses, we conducted
regression analyses between the results from these two methods. We integrated the area
underneath the central peak in the cross-correlogram (CCH) and the area between the
estimated coherence (red trace in Figure 7.2-5B) and the theoretical 95% confidence limit
(blue line in Figure 7.2-5B). Linear regression analysis (Figure 7.2-6) reveals strong
correlation between neural synchrony and the corresponding coherence (R² = 0.63), which
validates the normalization implemented in the JPSTH. Our results suggest that
cross-correlation analysis and coherence analysis are internally related, though these two
methods study neural connectivity from wholly different perspectives.

Figure 7.2-6 Scatter plot of integrated synchrony and integrated coherence.

7.2.3.5 Temporal Structure and Neural Correlation

The  fine  temporal  structure  in  neural  spikes  cooperates  with  the  average  response 
rate  of  visual  neurons  to  provide  visual  information,  though  these  two  variables  have 
different  roles  in  coding  specific  properties.  We  recorded  66  complex  cells  from  areas  17 
and  18  in  two  paralyzed  and  anesthetized  cats  under  the  stimulation  of  drifting  sinewave 
gratings (SF = 0.5 cycle/°, TF = 2 Hz, Contrast = 50%). A total of 694 pairs showed
significant  synchrony  quantified  by  cross-correlation  histogram  (CCH)  derived  by  JPSTH 
method.  The  frequency  dependence  of  these  synchronized  pairs  was  further  quantified 
with  multi-taper  coherence  analysis.  We  applied  a  spike  jittering  method  with  different 
time  distributions  (±5ms,  ±10ms,  and  ±20ms)  to  perturb  the  timing  accuracy  in  the  spike 
trains  at  different  levels  to  see  how  synchrony  and  coherence  depend  on  fine  structure. 
We  found  that  neural  synchrony  systematically  dropped  with  the  increase  of  jitter  time. 
Figure  7.2-7  shows  an  example  of  synchrony  modulation  with  spike  jittering.  Synchrony 
dramatically  decreased  after  spikes  were  jittered  within  a  ±10ms  or  ±5ms  time  range. 
Coherence  analysis  on  the  same  pair  of  cells  shows  that  the  frequency  dependence  also 
decreased  with  the  increase  of  the  spike  jittering  range.  We  found  that  strong  coherence 
between  the  unjittered  spike  trains  mainly  existed  in  the  gamma  band  (40  -  70  Hz)  with 
the  highest  coherence  value  (coherence  =  0.26)  located  at  52  Hz  (Figure  7.2-8A).  After 
the  spikes  of  both  cells  were  randomly  jittered  across  a  ±5ms  range,  coherence  strength 
(red  area  between  the  coherence  curve  and  95%  confidence  interval)  dropped 
dramatically  (Figure  7.2-8B).  When  the  two  spike  trains  were  jittered  across  broader  time 
ranges (±10ms and ±20ms), the frequency dependence between these two cells was
almost  completely  destroyed  (Figure  7.2-8C&D). 
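
The jittering procedure can be sketched as below: every spike time is displaced by an independent random offset drawn from the stated range. The uniform distribution, the 1 ms coincidence window, and the synthetic spike trains are assumptions for illustration; the report specifies only the jitter ranges.

```python
import random


def jitter_spikes(spike_times_ms, jitter_ms, rng):
    """Shift each spike by an independent uniform offset in [-jitter_ms, +jitter_ms]."""
    return sorted(t + rng.uniform(-jitter_ms, jitter_ms) for t in spike_times_ms)


def coincidences(train_a, train_b, window_ms=1.0):
    """Count spikes in train_a that have a spike in train_b within window_ms."""
    return sum(any(abs(a - b) <= window_ms for b in train_b) for a in train_a)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Synthetic pair: cell B fires 0.5 ms after each spike of cell A.
cell_a = [20.0 * k for k in range(1, 51)]
cell_b = [t + 0.5 for t in cell_a]

for jitter in (0.0, 5.0, 10.0, 20.0):
    ja = jitter_spikes(cell_a, jitter, rng)
    jb = jitter_spikes(cell_b, jitter, rng)
    print(f"+/-{jitter:>4.1f} ms jitter: {coincidences(ja, jb)} coincident spikes")
```

Jittering leaves the number of spikes, and hence the mean firing rate, unchanged; only the fine timing relationship between the two trains is perturbed.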

The  reduction  of  temporal  and  frequency  dependence  after  spike  jittering  in  the 
above  example  also  applies  to  our  entire  population.  We  averaged  the  synchrony  and 
coherence  data  for  all  synchronized  pairs  we  identified.  The  average  synchrony  is  highest 
before  spike  trains  were  jittered.  The  central  peak  of  the  averaged  CCH  before  spike 
jittering  has  an  amplitude  of  1.16%.  After  the  spikes  were  jittered  randomly  with 
increased  time  ranges  (±5  ms,  ±10  ms,  and  ±20  ms),  synchrony  gradually  reduced  to 
0.82%, 0.66%, and 0.56% respectively, with statistical significance (paired Student’s t-test,
$p < 10^{-10}$). Meanwhile, the width of the central peak on the averaged CCHs also increased
when  the  spikes  were  jittered  across  a  broader  time  range,  indicating  a  disruption  of  the 
temporal  relationship  between  cells.  We  also  found  that  the  average  coherence  spectrum 
for  the  original  spike  trains  has  its  greatest  power  in  the  low  frequency  band  and  the 
second  peak  in  the  coherence  curve  is  found  around  45  Hz  (Figure  7.2-9A).  The  power  in 
the  coherence  spectrum  gradually  decreased  with  the  increase  of  the  spike  jittering  range. 
When  spikes  were  jittered  in  the  ±5ms  range,  the  second  coherence  peak  disappeared, 
and  when  spikes  were  jittered  in  the  ±10  or  ±20ms  range,  the  coherence  trace  in  the 
gamma  range  flattened. 

Figure 7.2-7 A pair of cells with similar orientation preferences shows synchronized
firing with stimulation of moving sinewave gratings. Synchrony systematically reduced
after spike timings in the original spike train were jittered over increasing time ranges.

Figure 7.2-8 Coherence analysis shows that the frequency dependence between cells also
reduces after spike timings were jittered in increasing time ranges.

Figure 7.2-9 Population averaged spectrum shows that coherence dropped after spikes
were jittered while the gamma coherence exhibited greatest loss compared to that of other
frequency bands.

To  explore  how  much  frequency  dependence  is  embedded  in  the  fine  temporal 
structure, we derived coherence modulation ($A_m(f) = [C(f) - C_{bl}(f)] / C_{bl}(f) \times 100\%$) of
different  timing  accuracies  with  respect  to  the  “baseline”  coherence  (i.e.,  coherence 
expected  in  the  absence  of  fine  coordination),  which  is  defined  by  the  average  coherence 
for  spike  trains  jittered  in  the  ±20ms  range.  We  integrated  the  areas  below  the  modulation 
curves  with  different  jittering  ranges  (no  jittering,  ±5  ms  jittering,  or  ±10  ms  jittering), 
and  compared  the  modulation  loss  among  alpha  (5-15  Hz),  beta  (15-40  Hz),  and 
gamma  bands.  In  the  alpha  band,  the  integrated  area  was  reduced  by  9%  or  34% 
respectively  after  ±5  ms  or  ±10  ms  jittering.  In  the  beta  band,  reduction  was  35%  or  81% 
respectively  after  ±5  ms  or  ±10  ms  jittering.  In  the  gamma  band,  however,  the  reduction 
was  81%  or  97%  after  ±5  ms  or  ±10  ms  jittering.  Our  results  suggest  that  the  fine 
temporal  structure  in  neural  spike  trains  is  critical  in  maintaining  the  frequency 
dependence  between  neurons,  and  that  the  frequency  band  that  is  most  sensitive  to  the 
integrity  of  the  temporal  structures  in  the  neural  spike  trains  is  the  gamma  band.  Since 
the  global  statistics  (average  firing  rate,  interspike  intervals)  of  the  spike  train  are  on 
average  preserved  by  jittering,  we  conclude  that  the  synchrony  and  spectral  correlations 
result  from  active  processes  and  not  artifact. 
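
The modulation and band-integration steps can be sketched as follows. This is a minimal illustration under stated assumptions: trapezoidal integration, a flat toy baseline, and invented coherence values; none of the numbers below are data from this study, and the function names are hypothetical.

```python
def modulation(coherence, baseline):
    """Percent modulation relative to the baseline coherence at each frequency."""
    return [(c - b) / b * 100.0 for c, b in zip(coherence, baseline)]

def band_area(freqs, values, f_lo, f_hi):
    """Trapezoidal area under values(f) for f_lo <= f <= f_hi."""
    pts = [(f, v) for f, v in zip(freqs, values) if f_lo <= f <= f_hi]
    return sum((f1 - f0) * (v0 + v1) / 2.0
               for (f0, v0), (f1, v1) in zip(pts, pts[1:]))

def percent_loss(area_original, area_jittered):
    """Reduction of the integrated modulation after jittering, in percent."""
    return (area_original - area_jittered) / area_original * 100.0

# Toy values (hypothetical), sampled across an assumed 40-70 Hz gamma band.
freqs = [40, 45, 50, 55, 60, 65, 70]
baseline = [0.10] * len(freqs)     # stands in for the +/-20 ms jittered coherence
original = [0.20] * len(freqs)     # unjittered coherence
jitter_5ms = [0.12] * len(freqs)   # coherence after +/-5 ms jitter

area_original = band_area(freqs, modulation(original, baseline), 40, 70)
area_jitter = band_area(freqs, modulation(jitter_5ms, baseline), 40, 70)
loss = percent_loss(area_original, area_jitter)   # about 80% for these toy values
```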

7.2.3.6  Neural  Cooperation  is  Dependent  on  Spatial  Coherence 

Synchronization  of  neural  responses  is  a  stimulus-dependent  phenomenon.  Neural 
synchrony  can  be  reduced  by  spatial  disorganization  or  discontinuity  in  stimuli,  but  there 
has  been  no  systematic  study  of  how  synchrony  depends  on  coherent  contours,  as 
opposed  to  general  spatial  coherence  (i.e.  motion  as  a  unit).  We  studied  the  response 
(synchrony  and  coherence)  modulation  in  197  cell  pairs  corresponding  to  the  perturbation 
of spatial coherence in the visual stimuli. The unperturbed stimulus consisted of drifting sinewave
gratings (SF = 0.5 cycle/°, TF = 2 Hz, contrast = 50%). The spatial coherence in the
stimulus was systematically perturbed by randomly swapping pixel areas (0.13 × 0.13
degree²) in the stimulus (Figure 7.2-10A). We found that the average synchrony and
coherence  systematically  dropped  with  a  decrease  of  spatial  coherence  (increase  of 
swapped  pixels),  while  the  gamma  band  (40-70  Hz)  coherence  displayed  faster  reduction 
than either raw synchrony or that in the alpha (5-15 Hz) and beta (15-30 Hz) bands
(Figure  7.2-10  B&C).  The  gamma  oscillation  thus  appears  to  be  the  most  sensitive 
indicator  of  image  coherence. 
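
One way to picture the perturbation is the sketch below, which treats the stimulus as a grid of 0.13° blocks and shuffles a randomly selected subset of them. The selection rule (each block chosen independently with the swapping probability, then permuted among the chosen positions) is an assumption made for illustration; the stimulus software actually used may have differed.

```python
import math
import random

def grating(n_rows, n_cols, cycles_per_block):
    """Vertical sinewave grating sampled once per block; values in [-1, 1]."""
    return [[math.sin(2 * math.pi * cycles_per_block * c) for c in range(n_cols)]
            for _ in range(n_rows)]

def swap_blocks(image, swap_probability, rng):
    """Select each block with swap_probability and shuffle the selected blocks
    among themselves. Block contents are moved, never altered."""
    n_cols = len(image[0])
    flat = [v for row in image for v in row]
    chosen = [i for i in range(len(flat)) if rng.random() < swap_probability]
    shuffled = chosen[:]
    rng.shuffle(shuffled)
    out = flat[:]
    for src, dst in zip(chosen, shuffled):
        out[dst] = flat[src]
    return [out[r * n_cols:(r + 1) * n_cols] for r in range(len(image))]

rng = random.Random(0)
# 0.5 cycle/deg sampled in 0.13 deg blocks gives 0.065 cycles per block.
coherent = grating(32, 32, 0.5 * 0.13)
unperturbed = swap_blocks(coherent, 0.0, rng)
perturbed = swap_blocks(coherent, 0.4, rng)
```

Because blocks are only relocated, the luminance histogram is identical before and after the perturbation; only the spatial arrangement changes.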


[Figure 7.2-10 image; horizontal axis: Swapping Probability (%)]

Figure 7.2-10 (A) Top: Sinewave gratings with coherent structure; Bottom: A perturbed
stimulus with 40% of pixels swapped randomly. (B) The average synchrony dropped
with a higher percentage of swapped pixels. (C) Gamma coherence decreased faster than
alpha and beta coherence as the spatial coherence was perturbed.

To  explore  the  evolution  of  neural  cooperation,  we  studied  the  temporal 
development  of  neural  synchrony  and  coherence  by  deriving  2-dimensional  CCHs  and 
coherence plots using a 500 ms moving window with 50 ms step resolution (Figure 7.2-11
A&B). The average synchrony reaches a maximum at nearly the beginning of the
stimulus  (50ms  after  stimulus  onset),  but  the  gamma  coherence  reaches  the  maximum 
much  later  (700ms  after  stimulus  onset). 
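
The moving-window bookkeeping can be sketched as below. The window length and step follow the text; the coincidence rule (a spike in one train within 1 ms of a spike in the other) and the toy spike times are illustrative assumptions, standing in for the full CCH and coherence estimates computed per window.

```python
def window_starts(duration_ms, window_ms=500, step_ms=50):
    """Start times of all full windows that fit inside the recording."""
    return list(range(0, duration_ms - window_ms + 1, step_ms))

def coincidences(train_a, train_b, start, stop, tol_ms=1.0):
    """Count spikes of A in [start, stop) that have a spike of B within tol_ms."""
    b_near = [b for b in train_b if start - tol_ms <= b < stop + tol_ms]
    return sum(1 for a in train_a
               if start <= a < stop and any(abs(a - b) <= tol_ms for b in b_near))

def sliding_synchrony(train_a, train_b, duration_ms, window_ms=500, step_ms=50):
    """List of (window start, coincidence count) pairs."""
    return [(s, coincidences(train_a, train_b, s, s + window_ms))
            for s in window_starts(duration_ms, window_ms, step_ms)]

# Toy spike times in ms (hypothetical).
train_a = [100.0, 600.0, 1100.0]
train_b = [100.5, 900.0, 1100.2]
profile = sliding_synchrony(train_a, train_b, duration_ms=2000)
```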


[Figure 7.2-11 image; horizontal axes: Time (msec)]

Figure 7.2-11 (A) Population-averaged 2-D CCH. (B) Population-averaged 2-D coherence
plot.


Our  results  suggest  that  the  spike  timing  synchronization  is  induced  nearly 
instantaneously  by  the  coherent  structure  in  the  visual  stimuli  while  frequency-dependent 
synchronization  develops  more  slowly,  presumably  through  network  interactions. 
Gamma  synchrony  would  appear  to  signify  a  cooperative  representation  of  image  detail 
that  cannot  be  achieved  through  the  more  basic  spike  synchronization  of  cell  pairs. 

7.2.3.7 Contour Quantification and Analysis

Synchrony  has  been  studied  among  pairs  of  neurons  with  artificial  stimuli 
(gratings,  rings),  but  its  behavior  among  larger  assemblies  in  the  context  of  natural  vision 
remains  unknown.  One  challenge  in  analyzing  responses  to  natural  stimuli  lies  in  linking 
responses  to  a  specific  stimulus  attribute.  We  have  recently  found  that  synchrony  may 
encode  contours  represented  by  artificial  ring  stimuli  (Samonds  et  al  2006,  7.4-34)  and 
wanted  to  know  if  this  finding  extrapolates  to  complex  contours  in  natural  images.  As  a 
preliminary  test,  we  derived  a  new  measure  for  detecting  and  quantifying  cooperative 
firing  within  neural  assemblies  of  arbitrary  size  and  used  this  measure  to  describe  how 
synchrony  changes  with  different  structural  modifications  of  the  same  natural  image.  We 
found  that  degradation  of  image  structure  (contours)  by  noise  or  low-pass  filtering  was 
represented  more  strongly  by  synchrony  than  by  average  firing  rate  in  large  assemblies. 

To  investigate  the  role  of  synchrony  as  a  contour-encoding  mechanism  in  natural 
vision,  we  examined  the  particular  spatial  structure  that  might  provide  the  ideal  stimulus 
for  an  assembly.  In  a  method  adapted  from  Smyth  et  al  (2003),  (7.4-38),  and  Weliky  et  al 
(2003),  (7.4-44),  we  presented  a  natural  image  sequence  and  identified  images  that 
elicited  a  synchronized  response  in  a  given  group.  Orientation  and  spatial  frequency 
components  filtered  by  the  measured  selectivity  and  spatial  localization  of  each  cell  in  the 
assembly  were  extracted  from  each  image,  summed  together  and  weighted  by  the 
magnitude  of  the  response.  The  resulting  composite  synthesized  a  stimulus  attribute  seen 
in  each  picture  or  combined  different  structural  features  to  create  a  new  attribute  that  was 
ideal  for  the  group. 

As  a  preliminary  finding  (Figure  7.2-12),  we  have  applied  the  analysis  to  an 
assembly  of  4  cells.  Orientation  and  spatial  frequency  components  spatially  matching  the 
preferences  of  the  cells  were  extracted  from  the  ten  images  with  the  highest  synchrony 
response  and  the  ten  images  with  the  highest  average  firing  rate  response  for  the  group. 
The  extracted  features  were  weighted  by  the  magnitude  of  the  response  and  summed  to 
produce  the  final  composite  images.  We  found  that  the  receptive  fields  of  the  cells  were 
aligned  on  a  complex  contour  in  the  image  with  the  highest  synchronized  response. 
Furthermore,  spline  and  receptive  field  analyses  revealed  that  each  complex  contour  was 
relatively  conserved  across  images  with  similarly  high  synchronized  responses.  In 
contrast,  contours  were  not  conserved  in  the  images  generating  the  highest  average  firing 
rate  across  the  assembly. 
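
The compositing step (select the strongest responses, weight each extracted feature map by its response, and sum) can be sketched as follows. The feature extraction itself (filtering by each cell's orientation, spatial frequency, and receptive field location) is not shown; the maps and responses below are toy placeholders, and the function names are hypothetical.

```python
def top_n(items, responses, n=10):
    """Return the n items with the largest responses, with their responses."""
    order = sorted(range(len(items)), key=lambda i: responses[i], reverse=True)[:n]
    return [items[i] for i in order], [responses[i] for i in order]

def weighted_composite(feature_maps, responses):
    """Sum equally sized 2-D feature maps, each scaled by its response magnitude."""
    n_rows, n_cols = len(feature_maps[0]), len(feature_maps[0][0])
    composite = [[0.0] * n_cols for _ in range(n_rows)]
    for fmap, weight in zip(feature_maps, responses):
        for r in range(n_rows):
            for c in range(n_cols):
                composite[r][c] += weight * fmap[r][c]
    return composite

# Toy 2 x 2 feature maps and synchrony responses (hypothetical values).
maps = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]]
responses = [0.5, 2.0, 1.0]
best_maps, best_responses = top_n(maps, responses, n=2)
composite = weighted_composite(best_maps, best_responses)
```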

The  image  sequence  included  2  null  stimuli  and  336  images,  each  rotated  at  9 
angles  (40°  to  360°,  in  40°  increments),  for  an  effective  stimulus  set  of  3026  pictures. 
Each  image  was  18°  x  18°.  The  images  were  displayed  in  a  random  order  for  0.5  sec 
followed  by  0.5  sec  of  mean  luminance.  The  entire  sequence  was  shown  10  times  (in 
different  random  orders).  Pictures  were  scaled  to  have  the  same  global  contrast.  We 
recorded  from  75  cells  in  the  visual  cortex  of  two  paralyzed  and  anesthetized  cats  and 
randomly  identified  4500  neural  assemblies,  ranging  in  size  from  2-10  cells  with  500 
assemblies  per  size  group.  In  practice,  we  have  found  that  any  arbitrary  group  of  8  cells 
or  less  will  reliably  synchronize  to  some  of  the  3026  images,  and  some  groups  of  9-10 
cells  will  repeatedly  synchronize  to  a  few  (e.g.,  3/3026)  particularly  complex  images. 
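
The stimulus and assembly counts quoted above follow from simple arithmetic, reproduced in the sketch below. The random sampling of assemblies is illustrative only: it draws from a single pool of 75 cell indices and does not remove duplicate assemblies, both of which are simplifying assumptions.

```python
import random

n_images, n_null = 336, 2
angles = list(range(40, 361, 40))             # 40, 80, ..., 360 degrees
n_stimuli = n_images * len(angles) + n_null   # rotated images plus null stimuli

rng = random.Random(0)
cells = list(range(75))                       # hypothetical cell indices
assemblies = {size: [tuple(sorted(rng.sample(cells, size))) for _ in range(500)]
              for size in range(2, 11)}       # sizes 2 through 10
n_assemblies = sum(len(groups) for groups in assemblies.values())
```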

Figure 7.2-12 Natural images with the largest synchrony and firing
rate responses (normalized to the maximum for each cell) for a given
assembly of 4 neurons whose receptive fields and preferred
orientations are outlined. To the right of each image are the extracted
features (contrast enhanced 2.5x) for the assembly. Note that the
extracted components with large synchrony responses tend to
describe complex contours in the images. The top ten extracted
images were weighted by the magnitude of their
response and summed to produce the final images (contrast enhanced 10x).

These results support the role of synchrony as a contour-encoding mechanism.
As a check on the validity of these predicted trigger features, we will filter all 3026
images with the trigger feature associated with each assembly and compute a score for
each  image  based  on  the  resulting  energy.  A  threshold  score  will  be  determined  from 
those  images  known  to  generate  synchrony  above  the  chance  level.  Some  images  that  did 
not  trigger  synchrony  in  a  given  assembly  may  yield  energy  above  the  threshold  level, 
indicating  cases  in  which  the  assembly  "missed"  a  putatively  effective  feature.  The 
control  analyses  will  help  to  confirm  whether  or  not  linear  summations  are  valid.  They 
may  also  demonstrate  the  effects  of  contour  occlusion.  Naturally-occurring  occlusions 
across  continuous  contours  may  disrupt  synchrony,  which  would  show  up  as  "missed" 
features  in  the  control  analyses. 
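
The planned control analysis could take a form like the sketch below. It is only one possible reading: the energy is taken as the sum of squared filtered pixel values, and the threshold is assumed to be the lowest score among images known to trigger synchrony. Both choices, and all names and values, are assumptions introduced here for illustration.

```python
def energy(filtered_image):
    """Sum of squared pixel values of a filtered image (2-D list)."""
    return sum(v * v for row in filtered_image for v in row)

def threshold_from_known(scores, synchronized):
    """ASSUMPTION: threshold = lowest score among images that triggered synchrony."""
    return min(s for s, sync in zip(scores, synchronized) if sync)

def missed_features(scores, synchronized, threshold):
    """Indices of images scoring at or above threshold that did not trigger synchrony."""
    return [i for i, (s, sync) in enumerate(zip(scores, synchronized))
            if s >= threshold and not sync]

# Toy filtered images (hypothetical) and whether each triggered synchrony.
filtered = [[[0.1, 0.2]], [[1.0, 2.0]], [[2.0, 2.0]], [[1.5, 2.0]]]
synchronized = [False, True, True, False]
scores = [energy(img) for img in filtered]
threshold = threshold_from_known(scores, synchronized)
missed = missed_features(scores, synchronized, threshold)
```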

From  data  collected  using  the  same  stimulation  protocol  as  above,  we  will 
perform  a  cluster  analysis  to  determine  how  each  assembly  classified  the  natural  image 
sequence.  Hierarchical  clustering  (similar  to  Rolls  &  Tovee,  1995,  7.4-31)  will  be 
performed  on  the  response  correlation  matrix,  which  contains  the  correlations  between 
the pairs of all responses to the image sequence for each assembly. Images within the
same  cluster  will  be  analyzed  for  contour  content  by  fitting  a  spline  to  large  gradients  of 
contrast  information  in  the  image  near  the  focus  of  the  assembly.  A  score  will  be 
determined  for  each  cell  based  on  the  degree  of  fit  of  the  cell’s  preferred  orientation  to  a 
line  tangent  to  the  spline  and  receptive  field  distance  from  the  spline.  The  mean  score  for 
the  assembly  represents  a  way  to  quantify  the  similarity  between  the  assembly 
configuration  and  the  contour  content  of  the  stimulus. 
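
The two ingredients of the per-cell score (distance of the receptive field from the contour, and mismatch between preferred orientation and the local tangent) can be sketched as below. A polyline stands in for the fitted spline, and how the two quantities are combined into a single score is left open, since the text does not specify it; all names are hypothetical.

```python
import math

def orientation_difference(a_deg, b_deg):
    """Smallest angle between two orientations (orientation repeats every 180 degrees)."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)

def nearest_segment(point, polyline):
    """Return (distance, tangent_deg) for the polyline segment closest to point.
    Assumes consecutive vertices are distinct."""
    px, py = point
    best = None
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        t = ((px - x0) * dx + (py - y0) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))             # clamp to the segment
        dist = math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))
        tangent = math.degrees(math.atan2(dy, dx)) % 180.0
        if best is None or dist < best[0]:
            best = (dist, tangent)
    return best

def cell_fit(rf_center, preferred_deg, polyline):
    """Return (distance to contour, orientation mismatch in degrees) for one cell."""
    dist, tangent = nearest_segment(rf_center, polyline)
    return dist, orientation_difference(preferred_deg, tangent)

contour = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]   # toy contour (hypothetical)
fit_1 = cell_fit((5.0, 1.0), 0.0, contour)
fit_2 = cell_fit((12.0, 5.0), 80.0, contour)
```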

7.2.3.8  Dependence  of  Group  Synchronization  on  Contour  Structure 

Our  brains  process  and  interpret  sensory  information  in  order  to  generate 
perceptions  of  the  environment  or  motivate  behavior.  However,  the  underlying 
mechanisms  by  which  salient  stimulus  features  are  represented  in  neuronal  responses 
remain  unresolved.  We  have  previously  shown  that  synchrony  between  cell  pairs  can 
represent  curvilinear  contours  (Samonds  et  al  2006,  7.4-34),  suggesting  that  synchrony 
within  larger  assemblies  may  be  involved  in  encoding  more  complex  contours. 
Extending  assembly  formation  to  cells  with  spatial  relationships  that  are  not  collinear  is 
vital  in  establishing  whether  synchrony  might  serve  as  a  general  mechanism  for  contour 
integration  and  shape  detection. 

To  investigate  the  role  of  synchrony  as  a  contour-encoding  mechanism  in  natural 
vision,  we  measured  the  synchronous  responses  of  large  neural  assemblies  to  a  sequence 
of  natural  images.  By  correlating  assemblies  and  image  elements,  we  found 
circumstantial  evidence  to  suggest  that  the  specific  feature  that  best  drives  a  synchronous 
assembly  is  a  continuous,  well-defined  contour.  For  a  more  thorough  investigation,  we 
quantitatively  related  the  contour  content  of  images  to  the  spatial  organization  of  the  cells 
composing  the  synchronous  assembly.  Since  measuring  properties  of  contours  in  images 
(e.g.  degree  of  presence  or  absence)  is  subjective  and  depends  on  the  amount  of  blur, 
contrast,  occlusion,  etc.,  we  chose  to  measure  how  well  a  contour  structure  matches  the 
assembly  configuration  and  then  related  the  assembly  configuration  to  the  image. 

We  used  criteria  from  the  association  field  theory  (Field  et  al.  1993,  7.4-15;  Hess 
et  al.  2003,  7.4-18)  to  quantify  how  well  an  assembly  fits  contour  structure.  Association 
field  theory,  which  is  based  on  studies  of  the  perception  of  contours  and  continuity, 
predicts linking between orientation-tuned cells that is dependent on their joint relative
orientation  and  spatial  position.  The  probability  for  linking  is  strongest  between 
contiguous  elements  with  shallower  relative  angles  and  closer  separations.  Using 
measures  that  represent  the  distance  between  neighboring  cells,  the  difference  in 
orientation  preferences,  and  contiguity,  we  created  a  three-dimensional  vector  (R),  the 
length of which is inversely proportional to the degree of association (see Figure 7.2-13A).
We defined the Contour Index, C, as a reversed, normalized version of R which
represents  the  degree  to  which  an  assembly  configuration  fits  a  contour  structure. 
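
A minimal sketch of such an index is given below. The text specifies only that R combines distance, orientation difference, and contiguity, and that C is a reversed, normalized version of R; the particular normalization used here (each component scaled to [0, 1], C = 1 - |R|/sqrt(3)) and the parameter names are assumptions made for illustration.

```python
import math

def contour_index(distance, orientation_diff_deg, discontinuity,
                  max_distance, max_orientation_diff_deg=90.0):
    """Return a value in [0, 1]; 1 means closest, most collinear, most contiguous.
    Each component of R is scaled to [0, 1] before the vector length is taken."""
    r = (min(distance / max_distance, 1.0),
         min(orientation_diff_deg / max_orientation_diff_deg, 1.0),
         min(max(discontinuity, 0.0), 1.0))
    length = math.sqrt(sum(x * x for x in r))
    return 1.0 - length / math.sqrt(3.0)

# Toy configurations (hypothetical values).
c_ideal = contour_index(0.0, 0.0, 0.0, max_distance=10.0)
c_worst = contour_index(10.0, 90.0, 1.0, max_distance=10.0)
c_near = contour_index(2.0, 15.0, 0.1, max_distance=10.0)
c_far = contour_index(8.0, 60.0, 0.1, max_distance=10.0)
```

Under this construction, assemblies whose cells are closer together and more nearly collinear receive a higher index, matching the qualitative prediction of association field theory.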

Figure 7.2-13 (A) Measure to quantify how well an assembly fits contour structure using
criteria from the Association Field Theory. (B) The receptive fields of two assemblies
superimposed on the images with the largest synchrony response and feature matrices depicting
the local contrast structure in the image to which the assembly is responding. (C) Bar plot shows
normalized synchrony and average firing rate responses for the assemblies depicted in B. The
scatter plot shows synchrony and average firing rate versus Contour Index for assemblies
chosen with a range of C values (N = 14). For each assembly, we measured responses across
1513 stimuli and chose the image with the largest synchrony response.

To  determine  how  well  the  spatial  organization  of  the  assembly  matched  an 
image,  we  used  the  assembly's  average  firing  rate  response.  Pooled  cell  activity  is 
directly  linked  to  local  contrast  structure  (Weliky  et  al  2003,  7.4-44),  so  the  degree  of 
response  is  a  reflection  of  how  well  the  orientation  and  spatial  frequency  components  in 
the image match the tuning properties of each cell in the assembly. Figure 7.2-13B
shows  the  receptive  fields  of  the  cells  comprising  two  assemblies  and  the  corresponding 
images  which  generated  the  largest  synchrony  responses.  The  amount  of  response  can  be 
visualized  by  creating  a  local  contrast  structure  feature  matrix  by  extracting  orientation 
and  spatial  frequency  components  in  the  image  that  match  the  tuning  properties  of  each 
cell  in  the  assembly  (Weliky  et  al  2003,  7.4-44). 

In this example, each assembly has a large average firing rate response (see
Figure 7.2-13C bar plot), but assembly 1 also had a large synchrony response. The
configuration  of  assembly  1  is  more  similar  to  a  contour  (higher  Contour  Index),  which 
means  the  contrast  structure  to  which  it  is  responding  is  more  likely  to  be  arranged  in  a 
contour  as  compared  to  assembly  2.  Results  from  twelve  additional  assemblies  also 
reveal that synchrony is directly correlated with the Contour Index (see Figure 7.2-13C
scatter plot). These findings suggest that contoured assemblies synchronize to similarly
arranged  local  contrast  structure  and  provide  quantitative  evidence  that  synchrony  signals 
contiguous,  well-organized  contours  in  natural  images. 

7.2.3.9  Sparse  Coding  as  a  Framework  for  Information  Processing  in 
the  Brain 

Although  current  technology  allows  us  to  monitor  the  interactions  of  hundreds  of 
neurons  simultaneously,  we  cannot  understand  their  contribution  to  visual  perception 
without  a  conceptual  framework  that  describes  the  strategy  for  information  processing. 
We  investigated  the  selectivity  of  synchrony  in  large  assemblies  in  the  context  of  sparse 
coding,  where  information  is  represented  by  a  small  subset  of  cells  from  the  population. 
Population sparseness describes the response probability of the cell population at one
instant in time. The distribution of responses is leptokurtic with a large peak at zero (or
chance), indicating that most of the population is inactive. Lifetime sparseness describes
the activity of one neuron (or one assembly) in response to the set of all input images
over time (Willmore & Tolhurst 2001, 7.4-45), also yielding a leptokurtic response
distribution with a peak at zero (or chance). Across all input images, all cells have an
equal  response  probability  but  have  a  low  response  probability  for  any  single  image. 
With  sparseness,  the  dimensionality  is  not  reduced,  but  the  redundancy  in  the  input  is 
transformed  into  the  redundancy  in  the  firing  pattern  of  the  cells  (Field  1994,  7.4-16).  In 
this  manner,  information  about  the  environment  is  distributed  across  all  cells  and  objects 
are  represented  by  which  cells  are  active  and  not  by  the  relative  activity  or  overall 
amount  of  activity  of  a  specific  subset. 

This  approach  has  been  found  to  be  consistent  with  the  representations  of  natural 
scenes  (Field  1987,  1989,  1994,  7.4-13,14,16;  Zetzsche  1990,  7.4-50).  The  degenerate 
mapping of highly correlated stimuli to a sparse representation is described as a
fault-tolerant method of reliably learning and discriminating closely related patterns (Leonardo
2005, 7.4-26). In the primary visual cortex of primates, neurons produce (lifetime) sparse
responses when stimulated with natural image sequences (Vinje and Gallant 2000; 2002,
7.4-42,43), and high lifetime and population sparseness is found in V1 of ferret (Weliky et
al 2003, 7.4-44). Furthermore, responses are most sparse when the non-classical
receptive field is stimulated, implying that sparseness is modulated by context. Several
findings  also  suggest  the  visual  system  may  be  optimized  for  processing  the  statistics  of 
natural  scenes  (Barlow  1961,  7.4-6;  Kersten  et  al  1987,  7.4-23;  Simoncelli  2003,  7.4-36) 
by  employing  a  sparse  coding  strategy  (Olshausen  and  Field  1996,  7.4-28;  Bell  and 
Sejnowski  1997,  7.4-8;  van  Hateren  and  van  der  Schaaf  1998,  7.4-41;  van  Hateren  and 
Ruderman  1998,  7.4-40;  Hyvarinen  and  Hoyer  2000,  7.4-20). 

There  are  several  biological  advantages  for  employing  a  sparse  coding  strategy. 
For  instance,  by  utilizing  a  small  number  of  neurons  during  any  one  task,  sparse  codes 
are  energy  efficient.  Estimates  of  the  energy  required  for  signaling  in  cortical  neurons 
suggest  that  less  than  2%  of  the  population  can  be  significantly  active  at  any  given  time 
(Attwell  and  Laughlin  2001,  7.4-3;  Lennie  2003,  7.4-25).  Experimental  evidence  has 
shown  that  average  firing  rates  are  low  for  natural  images  compared  to  that  from  optimal 
grating  stimulation,  which  reduces  the  metabolic  demands  of  visual  processing  (Baddeley 
et  al  1997,  7.4-4;  Guo  et  al  2005,  7.4-17).  Also,  several  theoretical  and  computational 
studies  have  shown  that  sparse  representations  are  most  effective  for  storing  patterns  in 
associative memory models (Willshaw et al 1969, 7.4-47) and they are advantageous for
learning  associations  in  neural  networks  (Palm  1980,  7.4-30;  Baum  et  al  1988,  7.4-7; 
Zetzsche  1990,  7.4-50;  Field  1994,  7.4-16).  Furthermore,  Hoyer  and  Hyvarinen  (2002), 
7.4-21,  have  shown  that  a  multi-layer  sparse  coding  network  is  capable  of  learning 
contour  coding  from  natural  images  in  an  unsupervised  fashion. 

While  sparse  coding  requires  a  subset  of  cells  to  be  active  for  any  given  input, 
what  properties  of  the  input  define  the  subsets  formed?  We  hypothesized  that  this 
representation  is  reflected  in  the  cooperative  activity  of  the  subset.  In  other  words, 
synchrony  is  a  way  to  implement  a  sparse  coding  strategy  where  higher-order  stimulus 
information  is  represented  in  the  precise  temporal  pattern  of  a  neural  assembly.  Precise 
temporal  coordination  among  neurons  preserves  the  requirements  for  sparse  coding  in 
that  information  is  transmitted  efficiently  (through  cooperation)  and  by  a  few  neurons 
(assembly). 

We  assessed  the  selectivity  of  synchrony  on  the  basis  of  the  probability 
distribution  of  such  responses  across  presentation  of  numerous  natural  images.  We 
measured  the  sparseness  of  synchrony-based  and  firing  rate-based  coding  strategies  using 
a  natural  image  sequence  of  3024  pictures.  Groups  of  2-8  cells  were  chosen  for  analysis. 
We  tested  and  compared  6  sparseness  metrics  that  can  be  alternatively  computed  for 
population  or  lifetime  analysis  by  changing  2  variable  definitions.  We  used  3  metrics 
defined  by  Olshausen  &  Field  (1997),  7.4-29,  kurtosis,  the  Rolls  &  Tovee  (1995),  7.4-31, 
measure  as  modified  by  Willmore  &  Tolhurst  (2001),  7.4-45,  and  the  "activity  (or 
response)  sparseness"  measure  (Willmore  &  Tolhurst  2001,  7.4-45). 

We found that synchrony response distributions (Figure 7.2-14A) were leptokurtic
and the kurtosis increased nearly exponentially with group size (2 cells: S1 = 0.957, 4
cells: S1 = 9.13, 6 cells: S1 = 27.7, 8 cells: S1 = 68.9). The kurtosis for firing rate
response distributions (2 cells: S1 = 0.149, 4 cells: S1 = 0.249, 6 cells: S1 = 0.484, 8 cells:
S1 = 0.889) did not change significantly as a function of group size (p > 0.6). Using the response
sparseness metric (Figure 7.2-14B), we found that assemblies of 2 cells had significant
cooperative  activity  for  10.6%  of  the  stimulus  sequence  (321  images)  and  assemblies  of  8 
cells  had  significant  cooperative  activity  for  4.1%  (124  images).  However,  assemblies  of 
2  cells  had  significant  average  firing  rate  activity  for  14.9%  of  the  stimulus  sequence 
(451  images)  and  assemblies  of  8  cells  had  significant  average  firing  rate  activity  for 
8.4%  (255  images).  Overall,  our  results  suggest  that  increasing  assembly  size  enhances 
the  ability  of  synchronous  assemblies  to  discriminate  among  the  natural  image  sequence 
more  than  a  firing  rate  code.  This  was  true  for  all  6  sparseness  metrics  tested  and  for 
both  lifetime  and  population  responses. 

[Figure 7.2-14 image; panel title: (A) Synchrony Distributions]

Figure 7.2-14 (A) Average kurtosis for synchrony and firing rate response distributions
with 2, 4, 6, and 8 cells per assembly. (B) Response sparseness is computed as the number of
images to which the assemblies did not respond (top); the bottom plot shows [1 - response
sparseness] and represents the number of images with responses that were 1.96 standard deviations
above chance. (C) Lifetime and (D) Population sparseness as a function of assembly size
using the last 4 metrics in (E). Population sparseness describes the response distribution of
many assemblies to 1 image. Lifetime sparseness describes the response distribution of 1
assembly to many images. S3 is the formula proposed by Rolls and Tovee (1995), 7.4-31, as
modified by Willmore and Tolhurst (2001), 7.4-45. S4-S6 are measures provided by
Olshausen and Field (1997), 7.4-29, and Willmore et al. (2000), 7.4-46.

7.2.3.10 Conclusions - Bio-Optics of Vision

Current  models  of  visual  processing  (form  extraction,  contour  detection)  are 
based  largely  on  the  activity  of  individual  filters  (neurons)  as  opposed  to  population 
activities.  This  is  mainly  a  consequence  of  the  technical  challenges  of  recording 
simultaneously from larger populations of neurons. Our laboratory was routinely
successful in recording from >60 neurons simultaneously.

We  have  pursued  validation  of  the  Binding  By  Synchrony  hypothesis,  in  which 
complex  visual  structures  are  integrated  by  the  synchronization  of  cellular  assemblies  on 
a  msec  scale.  We  have  found: 

♦ The formation of larger (>2) synchronized assemblies occurs much more often than would be predicted by chance

♦ Synchronization is stimulus-dependent and highly selective

♦ The analytical methods used to quantify synchrony (JPSTH and Coherency) are reliable and internally consistent

♦ Synchrony is highly dependent on fine temporal structure in spike trains and appears causal, in that reconstruction of sequences with identical overall firing characteristics destroys nearly all synchrony

♦ Synchrony is highly dependent on spatial coherence within an image

♦ Synchrony appears to be encoding contiguous contours of similar orientations

♦ Sparse coding via synchrony appears to be an efficient and viable substrate for the representation of coherent visual structures




7.2.3.11  Bio-Electrode  Development 

In a related area, we have discovered a process for creating a nanodiamond
electrode with a novel nanostructure, denoted a “Coral” Carbon Microprobe
(Figure 7.2-15). These “coral”-type microstructure carbon films, with and without N2
incorporation, were grown on sharpened tungsten microprobes by plasma enhanced
chemical vapor deposition (PECVD) using H2/CH4/N2 and H2/CH4 gas mixtures,
respectively. The coral-type carbon coated tungsten microprobes, characterized
electrochemically in various concentrations of ferrocyanide in a background of
0.1 M KCl, show excellent structural stability, with similar microstructure before and after
prolonged analysis and without the need for surface pretreatment. The microprobes exhibit
quasi-reversible kinetics with a high signal-to-background (S/B) ratio. The N2 incorporated
microprobe shows a slightly wider potential window, no surface adsorption of the analyte,
and higher sensitivity as compared to the sample without nitrogen incorporation.
Furthermore, the wide potential window of ~3 V compares well with that of high-quality
boron-doped diamond, which has ~3.5 V. The good electrochemical behavior and
simple fabrication method make the “coral” carbon film microprobes an excellent
candidate for electrochemical sensing.


Figure  7.2-15  SEM  of  “coral”  type  nanodiamond  grown  on  tungsten  needle  substrate  for 
bio-electrode  application 

We proceeded with detailed characterization of the novel “coral”-type nanodiamond
electrode. Specifically, evaluation of the electrode performance of the undoped and nitrogen
incorporated nanodiamond films was begun.

Figure 7.2-16 shows the SEM micrographs of “coral”-type microstructure carbon
films  grown  on  sharpened  tungsten  microprobes  by  plasma  enhanced  chemical  vapor 
deposition  (PECVD)  using  H2/CH4  and  H2/CH4/N2  gas  mixtures  that  give  rise  to  the 
undoped  (sample  1)  and  N2  incorporated  (sample  2)  carbon  microprobes,  respectively. 
Sample  2,  which  has  N2  incorporation,  has  a  smoother  surface  morphology  as  compared 
to that of the undoped sample 1.


Figure 7.2-16 SEM micrographs of: (a) sample 1 (carbon film without nitrogen) and (b)
sample 2 (carbon film with nitrogen)

However, the corresponding high magnification micrographs, Figure 7.2-17, show
similar  microstructures  for  both  the  samples  resembling  naturally  occurring  coral.  The 
EDS  spectra  confirm  that  the  films  are  indeed  carbon  and  show  the  presence  of  nitrogen 
in  the  N2  incorporated  carbon  film. 


Figure 7.2-17 High magnification SEM micrographs of: (a) sample 1 (carbon film without
nitrogen) and (b) sample 2 (carbon film with nitrogen)

The microprobes were used to establish the potential window and background
current in 0.1 M KCl at a scan rate of 100 mV/s. The voltammograms seen in Figure 7.2-18
indicate a wide potential window and low background current, which makes the microprobes
candidates for detection of a wide variety of analytes with a high signal-to-background
ratio. Furthermore, sample 2 (N2 incorporated carbon film) shows a slightly wider
window of ~3 V as compared to ~2.8 V for sample 1 (no nitrogen). These values are
impressively high, considering ‘normal’ boron doped diamond has a ~3.5 V potential
window (Xu J. (1997), (1998), 7.4-48, 49).


Figure 7.2-18 Cyclic voltammograms obtained from Sample 1 (undoped carbon film) and
Sample 2 (N2 incorporated carbon film) in 0.1 M KCl at a scan rate of 100 mV/s

In conclusion, the “coral” carbon film coated
tungsten microprobes, with and without N2 incorporation, exhibited excellent
electrochemical behavior. Both the undoped and N2 incorporated carbon film
microprobes have very similar microstructure, even though the latter showed smoother
surface morphology. The electrochemical behavior was under further evaluation. As the
funding level was reduced for the Vanderbilt subtask in Year 2, the task “Bio-Optic
Sensor Electrode Development” took a diminished role and certain other tasks were
reduced.

7.3 Research nanotechnology capabilities for enhancing
imaging sensor performance, including advanced high definition
infra-red displays

7.3.1  Overview 

A  need  exists  for  an  emitter  in  the  infrared  and  related  frequencies  of  the  spectrum 
that  is  efficient,  strong,  responsive,  tunable,  miniaturizable,  electronically  addressable 
and  cost  effective.  Present  technology,  which  often  resorts  to  "hot  wires",  an  approach 
more  than  a  century  old,  is  fragile,  unreliable,  slow  and  expensive. 

Vanderbilt has developed its diamond microfabrication capability to achieve
diamond thermal emitters. Using Vanderbilt's diamond deposition technology, the
"glowing" diamond, Figure 7.3-1, can be patterned into a variety of configurations and
sizes to achieve robust, efficient, cost effective, high-performance emission elements
for the IR (infrared) or other frequencies. It is possible to change the frequency in the
device very rapidly across a wide range of electronically tunable options, and further
possible to combine these elements into a pixel array.


The necessary elements to achieve this have been demonstrated at Vanderbilt,
including:

A. Patterned emitting diamond elements with a monolithic microelectronic
interconnect.

B. Power behavior including emission from deep IR to visible.

C. Processing capability to achieve emission elements on silicon or ceramic
substrates, attached or suspended.

See Figure 7.3-2.

Diamond power resistors have the advantage of rapid dissipation of Joule heating,
assured by the high thermal conductivity of the material. Diamond is excellent for
accommodating temperature cycling because of its extremely small thermal coefficient of
expansion. Thin film diamond resistors of various sizes have been designed and
fabricated on an electrically insulating aluminum nitride substrate. The resistors are
fabricated by a microwave plasma assisted chemical vapor deposition process. Boron
doping was from a trimethylboron (TMB) gas source. The resistance, and hence the Joule or
I²R heating/emission behavior of diamond, is controlled by the doping level in the
diamond.

Figure 7.3-1 Infra-red pixel of glowing diamond microresistor. The resistor glows
under high current loading.

Resistor Micro-Array for IR Display: infrared emitter source comprised of diamond
microstructure pixel elements in patterned array technology.

The diamond array will be fabricated similarly to conventional thin film resistors, by
processes developed by Vanderbilt. It will be doped with boron to be sufficiently
conductive to have controlled Joule heating under applied power. Because the element
is diamond, there are many advantages in addition to the modulated set point, such as
very 'crisp' on/off illumination profiles because of diamond's unsurpassed thermal
conductivity.
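
The controlled Joule heating referred to here follows the usual relations P = I²R = V²/R. The sketch below is only an illustration: the 1 kΩ resistance, the 10 mA drive current, and the function names are assumptions chosen for the example, not values or tools from this work.

```python
def joule_power_from_current(current_a: float, resistance_ohm: float) -> float:
    """Dissipated power P = I^2 * R, in watts."""
    return current_a ** 2 * resistance_ohm


def joule_power_from_voltage(voltage_v: float, resistance_ohm: float) -> float:
    """Dissipated power P = V^2 / R, in watts."""
    return voltage_v ** 2 / resistance_ohm


def power_density_w_per_cm2(power_w: float, width_um: float, length_um: float) -> float:
    """Power per unit element area in W/cm^2 (1 um = 1e-4 cm)."""
    area_cm2 = (width_um * 1e-4) * (length_um * 1e-4)
    return power_w / area_cm2


if __name__ == "__main__":
    resistance = 1000.0   # ohm, hypothetical element resistance
    current = 0.010       # A, hypothetical drive current
    power = joule_power_from_current(current, resistance)
    print(f"P = {power:.3f} W")
    print(f"Density on a 50 x 100 um element: "
          f"{power_density_w_per_cm2(power, 50, 100):.0f} W/cm^2")
```

Because the resistance is set by the boron doping level, the same drive voltage produces a different dissipated power for each doping level, which is one way the emission behavior could be tuned.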


DIAMOND POWER RESISTORS

Diamond Resistors on Aluminum Nitride

Delineation of the resistors is accomplished by direct etching using reactive ion
etching with a patterned metal mask. Resistors with different device geometry and
doping concentration have been examined for power density response. Arrays of
microstructure resistors were exposed to controlled power levels to examine their
thermal response. Temperature is correlated to the spectral irradiance of the blackbody
thermal emission.


Figure 7.3-2 Tunable, miniaturizable, electronically addressable diamond
micro-resistor thermal emitters
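
The correlation between temperature and blackbody emission can be illustrated with Planck's law and Wien's displacement law. This is a sketch for an ideal blackbody only: a real diamond emitter has an emissivity below one, and the temperatures used here are hypothetical rather than measured.

```python
import math

H = 6.62607015e-34        # Planck constant, J*s
C = 2.99792458e8          # speed of light, m/s
K_B = 1.380649e-23        # Boltzmann constant, J/K
WIEN_B = 2.897771955e-3   # Wien displacement constant, m*K


def planck_radiance(wavelength_m: float, temperature_k: float) -> float:
    """Ideal blackbody spectral radiance, W per (m^2 * sr * m of wavelength)."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2.0 * H * C ** 2 / (wavelength_m ** 5 * math.expm1(x))


def wien_peak_um(temperature_k: float) -> float:
    """Wavelength of peak spectral radiance, in micrometres."""
    return WIEN_B / temperature_k * 1e6


if __name__ == "__main__":
    for t_k in (600.0, 1000.0, 1500.0):   # hypothetical element temperatures
        print(f"T = {t_k:6.0f} K  peak = {wien_peak_um(t_k):5.2f} um  "
              f"L(4 um) = {planck_radiance(4e-6, t_k):.3e} W/(m^2 sr m)")
```

Raising the element temperature both increases the radiance at every wavelength and moves the emission peak toward shorter wavelengths, which is consistent with emission ranging from deep IR to visible as the power is increased.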


The technology transfer and commercial ramifications of this technology can be seen
from the following short list, drawn from the many possible applications:

♦ A very fast digital "TV screen" in the IR region of the spectrum, which can be
used for biomedical illumination or mapping, security monitoring, and
combat/weather/medical image simulation. For example, present techniques for breast
cancer detection involve thermal imaging. The "live" signals can be recreated on the
diamond element "pixel" array, then enlarged, "morphed" or otherwise electronically
manipulated for enhanced diagnostics and improved predictions.

♦ The diamond emitter will be a unique illumination source for chemical
analysis. Many analytical instruments for environmental quality monitoring (air, water,
factory effluent, automotive standards, etc.), food processing, pharmaceuticals,
biological fluids (blood, plasma, etc.) and oil refining (anywhere that the substances
involve organic, carbon-hydrogen, materials) use an IR beam to illuminate, stimulate, or
otherwise analyze known and unknown samples. The limits of detection (sensitivity), the
range (variety of compounds), discrimination (ability to distinguish similar materials,
e.g., "good" cholesterol from "bad" cholesterol) and many other parameters depend
directly on the brightness (intensity or power), response time, tunability and
complexity that can be achieved with the IR source. The diamond emitters can be
superior to present materials in all these properties, resulting in purer drugs,
faster and cheaper environmental monitoring, and quicker, more widely deployed in-line
process control monitors for everything from food to plastics to fuel.

♦ The printing/copying process - an array of these IR elements, at the microsize
achievable with this technology, could significantly alter the cost and size aspects of the
entire thermal printing industry. With an addressable linear or 2-D array of rapid
response diamond IR (thermal) elements, the fax or POS receipt or conventional copy
could be generated in a fraction of the time and a fraction of the cost with a "fraction" of
the presently configured printer countertop footprint.


7.3.2 Objectives

The development effort comprised:

♦ First, design and fabrication of a diamond IR emitting element as per Milestone Three
(2005), Research nanotechnology capabilities for enhancing imaging sensor performance,
including establishing design approach for advanced high definition infra-red displays.

♦ Second, characterization, design optimization and operating limit evaluation of a
diamond pixel element IR array as per future work in Years Two through Five (2006-2009):

♦ Research nanotechnology capabilities for diamond based IR pixels, including
construction of prototype advanced high definition infra-red displays for enhanced
imaging sensor performance.

The prototype diamond IR pixel element will be designed in the first year. In the
second year and out years, the prototype diamond IR pixel element will be fabricated and
tested as a feasibility demonstration. In subsequent years, the design and process will be
further developed and characterized, and the element performance will be evaluated for
limit behavior. Preliminary design of a multi-element array will be achieved, and its
fabrication and characterization will be performed, in Years 2 to 3.

These objectives were achieved for the period during which the program was funded;
funding was terminated in the second year.

7.3.3  Results 

A PDF emitter resistor test array, Figure 7.3-3, was designed and fabricated with a
selection of process variables and geometries. This provided an initial database
identifying the relative importance of various process variables and configurations to the
resulting emitters. This evaluation began an assessment of methods for
enhancing the performance of diamond film resistor IR emitters made with plasma enhanced
chemical vapor deposited (PECVD) diamond. A model was under development to
interpret the observed emission performance.


Figure 7.3-3 PDF emitter resistor test array: cross section of a suspended diamond
resistor (over a cavity) and top view of the diamond resistor array


Diamond resistors were prepared and the data collection process was improved so that
more data could be collected to reflect the capabilities of the diamond resistors more
accurately.

For resistor fabrication, diamond layers were grown on more samples. Attempts
were made to grow diamond on 1 micron thick silicon oxide. The growth was not
successful because the diamond would not adhere to the oxide layer. The next attempt was
to grow doped diamond onto a thin layer of i-diamond, deposited with a bias enabled
diamond machine to avoid damaging the surface of the oxide. This experiment also failed to
achieve the desired doped diamond layer because of the same adherence problem. Success
was achieved by growing diamond on thicker oxide layers (about 3 microns thick) after
preparing the oxide for nucleation.

Resistivity was measured on the sample using the van der Pauw method and
found to be ~0.06 to 0.08 Ohm-cm. A better diamond film was observed on the thicker
oxides.
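
For readers unfamiliar with the van der Pauw method, the sketch below shows how a resistivity is typically obtained from two four-point resistances. It solves the standard van der Pauw relation exp(-πR_A/R_s) + exp(-πR_B/R_s) = 1 for the sheet resistance R_s and multiplies by the film thickness. The resistance readings of 50 and 53 ohm and the 3 micron thickness are illustrative assumptions, not measurements from this report.

```python
import math


def sheet_resistance(r_a: float, r_b: float, rel_tol: float = 1e-12) -> float:
    """Solve exp(-pi*r_a/Rs) + exp(-pi*r_b/Rs) = 1 for Rs (ohm/square) by bisection."""
    def residual(rs: float) -> float:
        return math.exp(-math.pi * r_a / rs) + math.exp(-math.pi * r_b / rs) - 1.0

    # residual() rises monotonically from -1 (Rs -> 0) to +1 (Rs -> infinity).
    lo, hi = 1e-6 * (r_a + r_b), 1e3 * (r_a + r_b)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo <= rel_tol * hi:
            break
    return 0.5 * (lo + hi)


def resistivity_ohm_cm(rs_ohm_per_sq: float, thickness_um: float) -> float:
    """Resistivity = sheet resistance * film thickness (1 um = 1e-4 cm)."""
    return rs_ohm_per_sq * thickness_um * 1e-4


if __name__ == "__main__":
    rs = sheet_resistance(50.0, 53.0)   # hypothetical four-point readings, ohm
    print(f"Rs = {rs:.1f} ohm/sq, rho = {resistivity_ohm_cm(rs, 3.0):.3f} ohm-cm")
```

For a symmetric sample (R_A = R_B = R) the relation reduces to R_s = πR / ln 2, which is a convenient check on the solver.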

Earlier measurements were inferior because of a ripple in the output signals.
This ripple is believed to originate from the power supply, owing to physical
limitations of the pulse generator hardware as it switches current to a load. The power
supply input will be tested and characterized in order to eliminate the ripple.
Filtering circuitry may be necessary to isolate the ripple.
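
As an indication of what such filtering could achieve, the sketch below evaluates a first-order RC low-pass filter. This is an assumed filter topology chosen for illustration, not the circuit actually built, and the component values and the 10 kHz ripple frequency are hypothetical.

```python
import math


def cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """Cutoff frequency of a first-order RC low-pass: f_c = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def gain(f_hz: float, fc_hz: float) -> float:
    """Magnitude response |H(f)| = 1 / sqrt(1 + (f/f_c)^2)."""
    return 1.0 / math.sqrt(1.0 + (f_hz / fc_hz) ** 2)


if __name__ == "__main__":
    fc = cutoff_hz(1000.0, 1e-6)   # hypothetical 1 kohm and 1 uF
    ripple_hz = 10_000.0           # hypothetical ripple frequency
    print(f"f_c = {fc:.1f} Hz, ripple passed at {100 * gain(ripple_hz, fc):.1f}% amplitude")
```

A low cutoff suppresses ripple more strongly but also slows the edges of a pulsed drive signal, so the cutoff would have to be chosen against the pulse timing used in the measurement.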

An interface for data acquisition (DAQ) was being built, Figures 7.3-4 and 7.3-5, to
read the input voltage and output current and to measure the light intensity
from the photodiode in a synchronized manner. The voltage and current readings will be
obtained synchronously in real time.

A signal isolator would be used to provide more precise input data through a
clearer input signal. Adjustments can be made as needed to improve the filtering process
in the measurement step.


Figure 7.3-4 Basic preliminary schematic drawing of input circuitry for DAQ system
(circuit diagram of filtering circuitry; labeled connections: from power supply, to
voltage isolator units, to device under test, temperature)


Figure 7.3-5 Overall preliminary design for device measurement (block labeled: DAQ)


The diamond that was successfully grown was etched into the various resistor
shapes. The preliminary fabrication results were encouraging, and the decision was made
to fabricate more resistors with the optimized procedure. A p-doped diamond layer was
successfully grown on thick (~3 micron) SiO2. Growth for 3.5 hours resulted in a
diamond thickness of about 3 microns. A thick insulating SiO2 layer is
desired so that a deeper trench can be formed to ensure complete isolation. A 1 micron
titanium layer was deposited by e-beam to mask the diamond during the delineation
process. The resistors were delineated by reactive ion etching (RIE) using the facility
at Auburn University.

Eight samples were processed at Auburn and delineated with the RIE process.
The etch rate for RIE from previous observation was ~0.1 micron/min. However, the first
attempt at the RIE process failed to produce any resistor pattern because the titanium
layer lifted off after 6 minutes of etching. The process parameters were adjusted to
reduce the etching rate, but similar results were seen on the second and third
samples. The cause of the failure was then determined. The 1 x 1 cm samples were
attached using a thermally conductive wax to a 4 inch wafer, and this wafer sat on spacers
that were attached to a separate 5 inch wafer (the RIE machine only accepts a 5 inch wafer).
Previously, the samples had been attached directly to the 5 inch wafer with thermally
conductive wax in all processes. The gap between the 4 and 5 inch wafers created thermal
isolation. The gas in the chamber that should constantly cool the wafer could not
reach it, and the high temperature caused an excessive etch rate at the top, which
nonuniformly removed the titanium and part of the diamond layer.
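
As a back-of-envelope check on these numbers, the sketch below assumes that the previously observed etch rate stays constant and that the diamond film is the roughly 3 micron layer grown in 3.5 hours described in this section. The function names are illustrative only.

```python
def growth_rate_um_per_h(thickness_um: float, hours: float) -> float:
    """Average deposition rate over the growth run."""
    return thickness_um / hours


def etch_time_min(thickness_um: float, rate_um_per_min: float) -> float:
    """Time to etch through a film at a constant etch rate."""
    return thickness_um / rate_um_per_min


def depth_etched_um(minutes: float, rate_um_per_min: float) -> float:
    """Depth removed after a given time at a constant etch rate."""
    return minutes * rate_um_per_min


if __name__ == "__main__":
    print(f"Average growth rate: {growth_rate_um_per_h(3.0, 3.5):.2f} um/h")
    print(f"Time to etch 3 um at 0.1 um/min: {etch_time_min(3.0, 0.1):.0f} min")
    print(f"Depth reached after 6 min: {depth_etched_um(6.0, 0.1):.1f} um")
```

Under these assumptions the titanium mask would have needed to survive roughly five times longer than the 6 minutes at which it lifted off.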

Although  the  problem  was  identified  and  resolved,  the  RIE  process  then  suffered 
from  a  problem  that  caused  its  reflected  power  to  be  too  high.  This  created  an  unsafe 
environment  for  the  equipment  and  the  process  had  to  be  stopped.  This  problem  with  the 
RIE  took  a  long  time  to  be  resolved  and  was  a  major  stumbling  block  for  the  fabrication 
of  the  diamond  resistors.  The  RIE  equipment  was  eventually  fixed;  another  trip  to 
Auburn  completed  the  process. 

Additional  diamond  resistors  were  fabricated,  increasing  our  stock  of  samples 
available  for  characterization  and  measurement.  The  delineation  was  performed  at 
Auburn  University  for  the  first  batch  of  resistors.  Additional  samples  were  fabricated  at 
Vanderbilt  and  work  proceeded  to  finalize  the  characterization  methods  as  the  program 
was  terminated. 

The resistors in this work are freestanding and “necked” in shape, in three different
resistor dimensions (10 x 50 µm, 50 x 100 µm, 200 x 500 µm), see Figure 7.3-6. The wide
range in resistor sizes and the freestanding structure led to problems that complicate
optimal characterization of the resistors. The cavity for the freestanding structure is
obtained by etching away the underlying oxide. To fully excavate the cavity under the
largest resistor structure, the etching process has to be prolonged.
However, this results in the smallest structure being over-etched, Figures 7.3-7 and
7.3-8. To get a properly defined resistor for the smallest structure, the etching process
has to be shortened, and this results in an improperly excavated cavity for the larger
structure. The small number of samples previously fabricated did not allow the resistor
structures to be optimized, so a trade-off was made: the etching process etched
away part of the smallest structure but was not long enough to form a complete cavity
for the largest structure. The resulting resistors were all compromised to a degree. To
obtain optimal resistors for every structure size, more samples need to be made so that
each sample can be optimized for a particular resistor size while sacrificing the others.
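
The trade-off described above can be made concrete with a rough sketch. It assumes, hypothetically, that the oxide etch is isotropic, that it undercuts each resistor from both long edges at the same constant lateral rate, and that the smaller of the two dimensions quoted for each size is the width to be undercut. The rate of 0.5 µm/min is a placeholder, not a value from this work.

```python
# Resistor sizes from the text, in micrometres; treating the first number
# as the width to be undercut is an assumption made for this sketch.
RESISTOR_SIZES_UM = [(10, 50), (50, 100), (200, 500)]


def release_time_min(width_um: float, lateral_rate_um_per_min: float) -> float:
    """Time for etch fronts from both edges to meet under a beam of this width."""
    return (width_um / 2.0) / lateral_rate_um_per_min


def excess_undercut_um(width_um: float, etch_time_min: float,
                       lateral_rate_um_per_min: float) -> float:
    """Undercut beyond what release needs (positive) or still missing (negative)."""
    return etch_time_min * lateral_rate_um_per_min - width_um / 2.0


if __name__ == "__main__":
    rate = 0.5                                # um/min, hypothetical placeholder
    compromise = release_time_min(50, rate)   # etch time tuned to the middle size
    for width, length in RESISTOR_SIZES_UM:
        print(f"{width:>3} x {length:<3} um: release needs "
              f"{release_time_min(width, rate):6.1f} min, "
              f"excess at {compromise:.0f} min = "
              f"{excess_undercut_um(width, compromise, rate):+6.1f} um")
```

Under these assumptions the required etch times differ by a factor of twenty between the smallest and largest structures, so any single etch time over-etches one and under-etches the other, which is why a separate sample per resistor size is suggested.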


Figure 7.3-6 Overall layout of the resistors.


Figure 7.3-7 Over-etching was observed on the small resistor.


Figure 7.3-8 Good delineation for the large resistor in the same process.


A method of characterization has been developed to enable accurate and reliable
data collection for proper characterization of the device. Earlier
characterization showed anomalous spikes which might have been caused by the power
supply. One way to negate this would be to build a buffer circuit that smooths out
any surges or dips from the power source. A prototype circuit was built and tested;
further work would have been appropriate to complete it. In addition, a method for
more accurate temperature measurement is in order. The current temperature probes are
too far from the resistors to provide the best reading of their temperature. Ideally a
remote temperature sensing device should be used, but no such equipment was yet
available.

As regards filtering the input signal, the ripple in the output signals
is believed to originate from the input signal and not from the diamond resistors. A
new interface for the data acquisition board was built to read the input
voltage and output current and to measure the light intensity from the photodiode in a
synchronized manner. The voltage and current readings would be obtained synchronously
in real time. A signal isolator would be used to provide more precise input data with a
clearer input signal (Singh J. (1995), 7.4-37). The Data Acquisition (DAQ) software
would need to be reworked to be more intuitive for this specific environment. The
power supply input would be tested and characterized to eliminate the ripple that was
present in the output signal. Filtering circuitry may be necessary to isolate the ripple
that is believed to originate from the power supply hardware, as described previously,
Figures 7.3-4 and 7.3-5.
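
The value of synchronized readings is that electrical power and resistance can be computed sample by sample and compared directly with the photodiode signal. The sketch below illustrates this with a hypothetical record layout; the field names and sample values are assumptions and do not describe the actual DAQ software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One synchronized reading; field names are hypothetical."""
    t_s: float            # timestamp, seconds
    voltage_v: float      # input voltage across the resistor
    current_a: float      # output current through the resistor
    photodiode_v: float   # photodiode signal, a proxy for emitted light


@dataclass
class Derived:
    t_s: float
    power_w: float
    resistance_ohm: Optional[float]   # None when no current flows
    photodiode_v: float


def derive(samples: List[Sample]) -> List[Derived]:
    """Compute P = V*I and R = V/I for each synchronized sample."""
    out = []
    for s in samples:
        r = s.voltage_v / s.current_a if s.current_a != 0 else None
        out.append(Derived(s.t_s, s.voltage_v * s.current_a, r, s.photodiode_v))
    return out


if __name__ == "__main__":
    readings = [Sample(0.000, 0.0, 0.0, 0.00), Sample(0.001, 10.0, 0.02, 0.50)]
    for d in derive(readings):
        print(d)
```

If the voltage and current were sampled at different instants, a ripple on the supply would appear as spurious changes in the computed resistance, which is one reason synchronous acquisition matters here.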


7.3.4  Conclusions 

Final efforts on this project concentrated on fabricating more resistor test structures
to increase availability and on preparing the characterization equipment and methods. The
availability of fully optimized resistors in each of the targeted dimensions will produce
results that highlight the true capabilities of the freestanding diamond resistors. This
work would lead to accurate data on the thermal behavior of freestanding diamond
resistors. The data obtained so far point to the superiority of diamond resistors over
other materials, especially for micro-thermal source purposes.


1. Introduction

As  sensor  networks  and  stand-alone  sensor  deployments  become  more  ubiquitous, 
expert  system  applications  that  use  sensor  data  will  become  common  in  a  variety  of  tasks, 
including  natural  disaster  response  and  mitigation  synthesis,  persistent  surveillance, 
including  monitoring  potential  terrorist  and  criminal  activity,  military  planning  and 
battlefield  management,  as  well  as  other  applications  in  which  sensors  must  be 
discovered,  evaluated,  and  controlled.  For  example,  the  United  States  Department  of 
Defense,  Department  of  Homeland  Security,  NATO,  and  other  government  agencies  and 
alliances  are  establishing  ubiquitous,  heterogeneous  sensing  environments  due  to  the 
requirement  of  monitoring  and  detecting  events  of  interest,  as  well  as  predicting  future 
system  states.  Data  obtained  from  a  variety  of  sensors,  such  as  acoustic  sensors,  can 
result  in  an  expert  system  detecting  a  possible  event  of  interest  and  initiating  subsequent 
problem-solving  activities.  Integration  and  fusion  of  sensor  data,  such  as  visible  and 
infrared  images  complemented  by  data  from  acoustic,  radiation,  vibration  sensors,  etc., 
may  reveal  further  knowledge  about  items  of  interest  resulting  in  an  increased  confidence 
in  a  given  hypothesis  or  the  pruning  of  a  set  of  competing  hypotheses. 

Opportunistic  composition  of  sensor  services  and  dynamic  fusion  of  data  obtained 
from  a  variety  of  heterogeneous  sensors  will  greatly  benefit  from  formal  knowledge 
representation  techniques  and  service-oriented  problem-solving  models.  Such  models, 
which  facilitate  semantic  markup  of  sensor  metadata  and  the  sensor  data  itself,  along  with 
a  sensor  service  registry,  will  support  the  discovery  of  sensor  services  satisfying  the 
constraints  specified  within  expert  system  applications.  Increasing  numbers  of  sensors 
will  be  discoverable  via  the  Web,  making  the  use  of  sensor  Web  services  critical  in  expert 
system  applications. 

The  current  standard  specified  by  the  Organization  for  the  Advancement  of 
Structured  Information  Standards  (OASIS)  for  service  discovery  within  a  service- 
oriented  architecture  (SOA)  is  Universal  Description,  Discovery,  and  Integration  (UDDI) 
(Clement  et  al.,  2004).  Despite  wide  commercial  acceptance,  the  UDDI  specification 
does  not  adequately  facilitate  autonomous  discovery  and  interoperation  of  disparate  Web 
services  via  queries  that  use  semantic  data  models. 

The  search  capability  that  conforms  to  the  UDDI  specification  lacks  adequate 
support  for  semantic  descriptions  of  Web  services,  which  is  integral  in  the  dynamic 
discovery  and  composition  process  (Paolucci  et  al.,  2002a;  Paolucci  et  al.,  2002b; 
Sivashanmugam  et  al.,  2003;  Akkiraju  et  al.,  2003;  Srinivasan  et  al.,  2004;  Luo  et  al., 
2005a;  Luo  et  al.,  2005b;  Akkiraju  et  al.,  2007;  Goodwin  et  al.,  2007).  UDDI’s  search 
capability  is  syntax-based  and  relies  solely  on  the  Extensible  Markup  Language  (XML), 
which enables syntactic, but not semantic, query without additional knowledge layers,
such as those offered by the Semantic Web infrastructure. Syntax-based matching alone
lends  itself  to  application-specific  software  development  where  reuse  of  Web  services  by 
others  is  difficult.  For  example,  the  syntactic  description  of  a  Web  service’s  output  could 
be  syntactically  different  than  the  input  of  a  candidate  service  for  composition,  but  the 
two  services  may  be  semantically  compatible.  However,  an  expert  system  application 
would  not  be  able  to  discern  the  similarity  without  semantic  descriptions  and  a  plausible 
composition  may  be  missed.  The  shortcomings  of  UDDI’s  syntax-based  search 
capabilities  are  two-fold:  i)  there  is  no  guarantee  that  the  returned  sensor  service  is 
correct  even  if  it  matches  the  keyword  search  due  to  the  lack  of  semantics;  and  ii) 
opportunities  to  locate  an  appropriate  sensor  service  are  frequently  overlooked.  For 
instance, keyword searching for an infrared camera does not return the service entry for a
mid-wave camera even though the mid-wave camera type is a specialization of the
infrared camera type.
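
The mid-wave camera example can be sketched as follows. The taxonomy, registry entries, and function names are hypothetical stand-ins used only to contrast keyword matching with subclass-aware matching; they are not OntoSensor classes or UDDI API calls.

```python
# Hypothetical taxonomy: child type -> parent type.
SUBCLASS_OF = {
    "MidWaveCamera": "InfraredCamera",
    "LongWaveCamera": "InfraredCamera",
    "InfraredCamera": "Camera",
    "VisibleCamera": "Camera",
    "Camera": "Sensor",
}

# Hypothetical registry entries.
REGISTRY = [
    {"name": "mwir-cam-01", "type": "MidWaveCamera",
     "description": "mid-wave camera service"},
    {"name": "acoustic-07", "type": "AcousticSensor",
     "description": "acoustic array service"},
]


def keyword_search(registry, keyword):
    """Syntax-based search: substring match on the free-text description."""
    kw = keyword.lower()
    return [e["name"] for e in registry if kw in e["description"].lower()]


def is_a(sensor_type, target_type):
    """True if sensor_type equals target_type or specializes it."""
    while sensor_type is not None:
        if sensor_type == target_type:
            return True
        sensor_type = SUBCLASS_OF.get(sensor_type)
    return False


def semantic_search(registry, target_type):
    """Subclass-aware search over the declared sensor types."""
    return [e["name"] for e in registry if is_a(e["type"], target_type)]


if __name__ == "__main__":
    print(keyword_search(REGISTRY, "infrared camera"))
    print(semantic_search(REGISTRY, "InfraredCamera"))
```

The keyword search misses the mid-wave camera because the phrase "infrared camera" never appears in its description, while the subclass-aware search finds it by walking up the type hierarchy.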

Search  processes  that  utilize  semantics  facilitate  generalization  and  specialization 
of  queries  that  search  for  sensors  of  interest  through  the  understanding  of  the 
relationships  between  ontological  objects  and  services.  For  instance,  semantic 
understanding  of  sensor  performance  capabilities,  operational  capabilities,  and  physical 
properties  may  aid  in  the  discovery  of  sensor  services  through  generalization  or 
specialization  of  the  initial  query.  The  investment  in  UDDI  makes  the  augmentation  of 
semantic  search  capabilities  within  its  existing  infrastructure  of  great  interest  to  those 
developing  government  and  commercial  applications. 

The  remainder  of  this  paper  describes  an  architecture  and  initial  prototype  for 
augmenting  UDDI  with  semantics  for  the  search  of  sensor  services  and  is  organized  as 
follows:  Section  2  provides  an  overview  of  the  technologies  and  infrastructure  leveraged 
in  this  paper,  including  the  Semantic  Web,  service-oriented  architecture  (SOA),  and 
current  approaches  for  service  discovery  using  semantics.  Section  3  presents  the  SOA 
intended  to  support  expert  system  applications.  Section  4  discusses  the  implementation 
of  the  sensor  SOA  prototype.  Section  5  summarizes  the  work  presented  in  the  paper  and 
provides  direction  for  future  research. 

2.0  Background 


The  sensor  service  architecture  described  in  this  paper  leverages  Semantic  Web 
infrastructure,  Web  services,  and  service-oriented  technologies.  OntoSensor  is  a 
Semantic Web-compliant ontology developed using Protege 2000 (Noy et al., 2002) for
describing  sensors  for  autonomous  processing  by  expert  system  applications 
(Russomanno  et  al.,  2005a,  Russomanno  et  al.,  2005b,  Russomanno  &  Goodwin,  2007). 
OntoSensor  remains  a  work  in  progress  that  is  undergoing  significant  revision,  but  it  does 
capture  a  subset  of  sensor  attributes  and  operational  parameters  to  support  the  prototype 
environment.  A  semantic  matchmaker  for  OntoSensor  was  developed  to  aid  in  sensor 
service  discovery.  OntoSensor  enhances  the  matching  process  of  finding  the  appropriate 
sensor  services  to  satisfy  a  subgoal  or  query  within  an  expert  system  application.  In  the 
prototype, OntoSensor’s taxonomic hierarchy is used to determine the type of a specific
sensor  instance  that  is  listed  in  a  service  registry  or  to  find  generalized  or  specialized 
results  that  may  partially  match  a  given  query  to  locate  sensors  having  specific  properties. 




The matchmaker enhances the initial syntax-based UDDI search by extracting the
portion of semantic metadata from OntoSensor that is pertinent to satisfying the query
and storing the metadata in UDDI. If the expert system’s query results in hits, that is,
if sensor services that match the expert system’s particular subgoal are directly
specified in the UDDI registry, then these sensors’ descriptions are returned. Otherwise,
the UDDI sensor service interface attempts semantic processing via OntoSensor to find
services that semantically match (to some degree) the expert system’s query. If such
matches are found, the ontological information from OntoSensor pertinent to the specific
sensor facts published in the registry is inserted into, or updated in, the UDDI tModels,
much like a conventional cache, for use in subsequent queries.
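
The control flow of this lookup can be sketched as a registry search with an ontology fallback and a cache. The class below is a simplified, hypothetical model: a dictionary stands in for the UDDI registry, another for the ontology's subclass relation, and a third for the metadata stored in tModels. It is not the prototype's implementation.

```python
class SensorRegistry:
    """Toy model of a registry lookup with semantic fallback and caching."""

    def __init__(self, entries, subclass_of):
        self.entries = entries            # service name -> declared sensor type
        self.subclass_of = subclass_of    # child type -> parent type (ontology stand-in)
        self.tmodel_cache = {}            # query type -> service names (tModel stand-in)

    def _is_a(self, sensor_type, target_type):
        while sensor_type is not None:
            if sensor_type == target_type:
                return True
            sensor_type = self.subclass_of.get(sensor_type)
        return False

    def find(self, query_type):
        """Return (matching service names, where the answer came from)."""
        direct = {n for n, t in self.entries.items() if t == query_type}
        cached = set(self.tmodel_cache.get(query_type, []))
        if direct or cached:
            return sorted(direct | cached), "registry"
        # No direct hit: fall back to semantic matching via the ontology.
        matches = sorted(n for n, t in self.entries.items()
                         if self._is_a(t, query_type))
        if matches:
            self.tmodel_cache[query_type] = matches   # reuse in later queries
        return matches, "ontology"


if __name__ == "__main__":
    reg = SensorRegistry({"mwir-cam-01": "MidWaveCamera"},
                         {"MidWaveCamera": "InfraredCamera"})
    print(reg.find("InfraredCamera"))   # first query needs the ontology
    print(reg.find("InfraredCamera"))   # second query is served from the cache
```

The second identical query is answered without consulting the ontology, which mirrors the cache-like role the text assigns to the tModels.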

Preliminary aspects of the development of the ontology-based SOA have been
presented  by  Goodwin  et  al.  (Goodwin  et  al.,  2007;  Goodwin  &  Russomanno,  2006; 
Russomanno  &  Goodwin,  2007);  however,  these  work-in-progress  publications  do  not 
contain  the  details  presented  in  this  paper. 

2.1  Semantic  Web 

The  Semantic  Web  effort  seeks  to  establish  a  collaborative  Web  through  sharing 
and  reuse  of  machine-interpretable  conceptualizations  of  knowledge  that  can  be  used  by 
intelligent  computer  systems,  including  expert  systems.  The  Semantic  Web  effort 
focuses  on  defining  and  moving  ontologies  and  knowledge  representation  standards  from 
traditional  stand-alone  systems  to  the  highly-distributed  World  Wide  Web  (Berners-Lee 
et  al.,  2001,  Fensel  et  al.,  2003;  McGuiness,  2003). 

XML  provides  the  syntax  to  describe  documents,  objects,  attributes,  and  other 
resources  on  the  Web.  However,  the  context  of  the  document  is  primarily  intended  for 
human  interpretation,  making  interoperability  difficult,  since  the  meaning  of  the  XML 
tags  may  only  be  known  by  the  developer  or  may  be  ambiguous  to  others,  particularly 
computer  algorithms,  without  formal  semantics.  The  Resource  Description  Framework 
(RDF)  is  an  XML-based  language  to  describe  the  XML  tags  providing  an  elementary, 
machine-interpretable  knowledge  representation  and  metadata  for  the  Web  (Brickley  & 
Guha,  2000).  Resource  Description  Framework  Schema  (RDFS)  extends  the  capabilities 
of  RDF  by  adding  a  vocabulary  to  describe  properties  and  inheritance  relations.  The 
Web  Ontology  Language  (OWL)  (Smith  et  al.,  2003)  extends  RDFS  by  adding  the 
constructs  to  define  properties,  such  as  cardinality  constraints  and  disjoint  classes,  and 
other  semantic  constructs.  OntoSensor  was  implemented  using  OWL.  RDF,  RDFS,  and 
OWL  are  the  immediate  layers  above  XML  and  comprise  a  portion  of  the  Semantic  Web 
infrastructure. 

2.2  Service-oriented  architecture  (SOA) 

Service-oriented  architectures  (SOAs)  strive  to  establish  principles  and  standards  for 
description,  discovery,  connection,  and  communication  of  Web  services  (Singh  &  Huhns, 
2005).  Web  services  are  distributed,  functional  units  accessible  via  a  network 
environment  that  are  modular  and  self-describing.  A  Web  service  is  identified  by  a 
universal  resource  identifier  (URI)  and  contains  an  interface  description  describing  the 
capabilities  of  the  module,  communication  protocol,  and  port  locations.  Success  of  the 
SOA depends on communication standards for discovery and invocation of the
distributed applications. Furthermore, a framework is required for publishing and
describing  Web  services  so  that  applications  can  locate  the  requisite  services. 

UDDI  is  the  current  OASIS  standard  for  Web  service  discovery.  Figure  1  depicts 
a  generic  model  for  an  SOA  using  UDDI  which  provides  standards  for  description, 
discovery,  connection,  and  communication  between  Web  services.  The  architecture 
consists  of  the  following:  i)  service  provider  that  registers  descriptions  of  services;  ii) 
broker  that  maintains  a  registry  of  services  and  providers;  and  iii)  service  requestors  who 
search  the  registry  for  available  services. 
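
The three roles in this generic model can be illustrated with a minimal in-memory sketch. The `Broker` class and its `publish` and `find` methods are hypothetical names for illustration; a real deployment would exchange SOAP messages with a UDDI registry rather than call Python methods.

```python
from typing import Callable, Dict, List, Tuple

Endpoint = Callable[[], str]


class Broker:
    """In-memory stand-in for a service registry (not a UDDI implementation)."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Tuple[str, Endpoint]]] = {}

    def publish(self, provider: str, service_type: str, endpoint: Endpoint) -> None:
        """Role i: a provider registers a service description."""
        self._services.setdefault(service_type, []).append((provider, endpoint))

    def find(self, service_type: str) -> List[Tuple[str, Endpoint]]:
        """Role iii: a requestor searches the registry kept by the broker (role ii)."""
        return list(self._services.get(service_type, []))


def read_acoustic() -> str:
    """Hypothetical sensor service."""
    return "acoustic reading"


if __name__ == "__main__":
    broker = Broker()
    broker.publish("provider-A", "AcousticSensor", read_acoustic)
    for provider, endpoint in broker.find("AcousticSensor"):
        print(provider, "->", endpoint())   # requestor binds to and invokes the service
```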


Figure  1.  Generic  model  for  SOA 


The  following  description  of  the  UDDI  specification  follows  closely  from  (Singh 
&  Huhns,  2005).  UDDI  is  an  on-line  registry  where  service  providers  can  register  Web 
services  providing  descriptions  that  can  later  be  discovered  by  service  requestors.  UDDI 
provides  an  application  programmer’s  interface  (API)  to  query  and  publish  Web  services 
using  Simple  Object  Access  Protocol  (SOAP)  for  messages.  Web  Service  Description 
Language  (WSDL)  provides  XML  descriptions  for  inputs,  outputs,  bindings,  etc.  The 
WSDL  service  descriptions  are  mapped  to  the  UDDI  data  structures  where  they  are  later 
searched  when  fulfilling  service  requests.  UDDI  consists  of  two  APIs:  inquiry  and 
publish.  The  APIs  are  a  series  of  predefined  functions  that  accept  SOAP  messages  as 
input  and  return  SOAP  messages  as  output.  The  APIs  are  used  to  search  and  publish 
services  and  descriptions  of  the  services  in  the  UDDI  repository. 

UDDI’s  core  XML  data  structures  consist  of  five  objects:  publisherAssertion, 
businessEntity,  businessService,  bindingTemplate,  and  tModel.  The  businessEntity 
object  contains  information  pertaining  to  a  service  provider,  such  as  contact  information, 
alias  names,  etc.  A  service  provider  is  required  to  register  in  UDDI  before  it  can  provide 
service  advertisements.  The  businessService  object  contains  metadata  describing  the 
capabilities  of  the  service  and  a  key  linking  the  service  to  the  service  provider.  The 
bindingTemplate  object  provides  the  service  requestor  with  the  ability  to  access  and 
invoke  the  Web  service.  The  publisherAssertion  object  allows  service  providers  to  form 
associative  links.  For  instance,  a  company  may  have  many  departmental  organizations 
that  provide  different  Web  services.  Each  of  the  data  objects  contains  text  fields  and 
technical  models  (tModels)  for  inclusion  of  additional  metadata.  The  tModel  objects 
permit  external  links  to  resources  outside  of  the  UDDI  framework  providing  overall 
extensibility  for  the  system. 
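A simplified sketch of how these objects relate is given below. The field names are assumptions made for illustration and cover only a small part of each structure; publisherAssertion is omitted, and the real UDDI structures are XML documents with many more elements.

```python
from dataclasses import dataclass, field


@dataclass
class TModel:
    key: str    # e.g. "uddi:describedUsing"
    name: str


@dataclass
class BindingTemplate:
    access_point: str    # where the service can be invoked


@dataclass
class BusinessService:
    service_key: str
    business_key: str    # links the service back to its provider
    name: str
    bindings: list = field(default_factory=list)
    category_bag: list = field(default_factory=list)    # (tModel key, value) pairs


@dataclass
class BusinessEntity:
    business_key: str
    name: str


def services_of(entity, all_services):
    """Follow the business_key link from a provider to its services."""
    return [s for s in all_services if s.business_key == entity.business_key]


described_using = TModel("uddi:describedUsing", "describedUsing")
provider = BusinessEntity("b1", "Base station 1")
svc = BusinessService("s1", "b1", "MEP410_Platform",
                      bindings=[BindingTemplate("URL")],
                      category_bag=[(described_using.key, "OntoSensor.owl")])
found = services_of(provider, [svc])
```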

2.3  Approaches  to  augment  UDDI  with  semantics 

This  section  briefly  outlines  some  of  the  representative  approaches  for  augmenting 
UDDI  with  semantic  search  capabilities.  The  focus  is  primarily  on  the  approaches 
presented  by  Luo  et  al.  (Luo  et  al.,  2005a;  Luo  et  al.,  2005b)  and  Akkiraju  et  al.  (Akkiraju 
et  al.,  2007)  since  these  applications  were  the  most  influential  in  the  development  of  the 
sensor-based  architecture  presented  in  this  paper. 

A  rough  classification  of  representative  approaches  for  augmenting  UDDI  with 
semantic  capabilities  is  shown  in  Figure  2.  This  particular  classification  is  based  upon 
where  the  knowledge  for  semantic  processing  is  stored,  that  is,  internal  semantic  schema 
within  UDDI  tModels  or  external  semantic  schema. 


Figure  2.  Approaches  for  augmenting  UDDI  with  semantics 


Approaches  that  include  an  internal  semantic  schema  within  UDDI  must  load 
semantic  content  into  the  UDDI  registry  via  tModels,  which  often  results  in  the  loss  of 
semantics. Once a matchmaker has been developed to process the internal schema, the
tModels can be traversed in an attempt to satisfy a query. Additional knowledge and data
may  be  inserted  into  or  deleted  from  the  tModels.  The  external  matchmaker  resides 
outside  of  the  UDDI  registry  and  is  invoked  when  a  request  is  received.  The  semantic 
matcher  may  provide  query  expansion  and  generalization/specialization  to  locate 
semantically  similar  services. 

Referencing  external  schema  also  involves  the  limited  use  of  tModels.  When  a 
provider  registers  a  particular  service  an  accompanying  reference  is  inserted  in  the 
tModel  so  that  the  semantic  matcher  can  access  the  external  semantic  data  model  for 
subsequent  processing.  However,  in  this  approach,  the  tModel  is  used  exclusively  to  hold 
the  reference  to  the  external  semantic  data  model  and  does  not  hold  the  model, 
knowledge,  or  data  itself. 

UDDI  API  modifications  are  often  coupled  with  an  internal  semantic  schema, 
external matchmaker, or external semantic schema for query enhancement. Modifications
such as those of Sivashanmugam et al. (2003) extend the existing search capabilities and
allow the semantic registration of services via custom APIs. The disadvantage of such
custom approaches is that they do not maintain the standard UDDI API, possibly limiting
reuse in a variety of applications.

The  Luo  et  al.  (Luo  et  al.,  2005a,  Luo  et  al.,  2005b)  approach  bulk  loads  entire 
ontologies into the UDDI registry via extensive use of tModels. An ontology-aware
matchmaker  can  then  make  use  of  the  ontology  stored  in  UDDI,  as  well  as  service 
instances  that  reference  the  stored  ontology.  The  OWL-S  service  description  for  a  new 
service  maps  to  the  UDDI  data  model  objects.  The  data  objects  in  the  tModels  reference 
the  ontology,  which  was  bulk  loaded  into  UDDI.  The  realization  of  the  system  is 
analogous  to  developing  an  object-oriented  conceptual  model,  but  losing  semantics  when 
implementing the conceptual design in a relational database. For instance, OWL language
constructs such as subPropertyOf, Class, and subClassOf are retained, but the semantics
entailed by minCardinality, maxCardinality, and FunctionalProperty are lost.
Furthermore,  the  reliance  upon  the  elementary  query  capabilities  of  UDDI  will  further 
limit  the  inference  capabilities  of  an  expert  system  application. 

The architecture developed by Akkiraju et al. (2007) uses a variety of techniques;
hence,  it  is  classified  as  an  approach  that  uses  multiple  inheritance  as  shown  in  Figure  2. 
This  approach  is  composed  of  a  generic  Web  service  proxy,  constraint  checker,  dynamic 
binder  and  invoker,  and  semantic  UDDI  infrastructure.  The  generic  Web  service  proxy 
receives  service  requests.  The  generic  Web  service  proxy  can  be  viewed  as  the 
moderator  among  semantic  UDDI,  constraint  checker,  and  the  dynamic  binder  and 
invoker.  The  generic  Web  service  proxy  sends  a  request  to  the  UDDI  registry  to  find 
available  services  based  upon  the  user  query.  The  UDDI  registry  invokes  a  semantic 
matching  engine  and  returns  the  set  of  available  services  to  the  proxy  server.  The  generic 
Web  service  proxy  then  invokes  the  constraint  checker  which  takes  the  set  of  candidate 
services  and  checks  for  compatibility  between  the  services  and  ensures  that  they  meet  the 
service  constraints  of  the  initial  query.  Finally,  the  generic  Web  service  proxy  invokes 
the  dynamic  binder  and  invoker.  The  dynamic  binder  orchestrates  the  execution  chain 
among  the  candidate  Web  services  and  then  the  invoker  invokes  the  chain  to  satisfy  a 
particular  goal.  TModels  are  used  to  reference  additional  external  RDF  semantic 
information, such as ontologies or instance-specific data. The architecture allows the
registry to be used with or without semantic matching, depending on the specification of the
service requestor. The semantic matcher is registered in the UDDI registry as a service.
The  matcher  is  invoked  dynamically  based  upon  its  specialized  domain  allowing  for 
selection  of  the  best  matcher  given  the  query  criteria.  The  matchmaker  architecture  is 
viewed  as  a  service  so  the  inference  capabilities  are  constricted  only  by  the 
implementation  of  the  matchmaker. 
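The control flow described above (discover, check constraints, then bind and invoke) can be sketched as follows. This is only an illustration of the sequence of steps: the function `proxy`, the dictionary-based service records, and the `cost` constraint are all hypothetical and are not taken from Akkiraju et al.

```python
def proxy(query, registry, matcher, constraints, invoke):
    """Hypothetical generic proxy: discover, check constraints, then invoke."""
    candidates = matcher(query, registry)    # semantic discovery step
    accepted = [s for s in candidates if all(c(s) for c in constraints)]
    return [invoke(s) for s in accepted]     # bind and invoke, in order


def name_matcher(query, registry):
    # Stand-in matcher: "*" matches everything, otherwise substring match.
    return [s for s in registry if query == "*" or query in s["name"]]


registry = [{"name": "denoise", "cost": 2}, {"name": "classify", "cost": 9}]
constraints = [lambda s: s["cost"] < 5]
results = proxy("*", registry, name_matcher, constraints,
                lambda s: "ran " + s["name"])
```

Because the matcher is passed in as a parameter, a different matcher can be substituted per domain, which mirrors the idea of treating the matchmaker as a replaceable service.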

3.0  Prototype  Overview 

This section presents the design of an SOA that can be used to support expert system
applications. Section 3.1 provides additional details about OntoSensor, which is used in
the prototype for knowledge base support for discovering sensor services. Sections 3.2 and
3.3 present the design of the sensor discovery architecture. Section 3.4 discusses the use of
the sensor registry knowledge base as a cache whereby a subset of OntoSensor pertinent
for satisfying a query is stored in the registry’s tModels to aid in satisfying subsequent
queries.

3.1  OntoSensor 

Ontological  engineering  focuses  on  developing  shared,  machine-readable 
conceptualizations of knowledge discernible by intelligent computer systems (Lenat &
Guha,  1989;  Skuce  &  Monarch,  1990;  Uschold  &  Gruninger,  1996;  Chandrasekaran  et 
al.,  1999).  For  purposes  of  this  paper,  a  conceptualization  of  declarative  knowledge  as 
described  by  Genesereth  and  Nilsson  (Genesereth  &  Nilsson,  1987)  defines  an  ontology 
and  it  includes,  but  is  not  necessarily  limited  to,  the  following  as  described  in 
(Russomanno  et  al.,  2005a;  Russomanno  et  al.,  2005b): 

•  the  classes  to  which  objects  belong  (e.g.,  sensor  types) 

•  the  class  hierarchy  or  taxonomic  structure  (e.g.,  set  of  radiant  sensors  is  a  subset  of  all 
sensors) 

•  the  relational  basis  set  among  the  classes  (e.g.,  a  sensing  element  is  part  of  a  sensor) 

•  the  functional  basis  set  among  the  objects  (e.g.,  bandwidth(‘JERS  SAR’)  =  1.275 
GHz) 

•  the  capability  for  executing  special  programs  or  procedures  for  evaluating  the  truth  of 
literals  or  other  properties  (e.g.,  procedural  attachment) 
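The first four ingredients listed above can be pictured with a small sketch. The class names and the membership of 'JERS SAR' in a `SAR` class are illustrative assumptions and are not taken from OntoSensor; only the bandwidth value comes from the example in the list.

```python
# Class hierarchy: child class -> parent class (illustrative names).
SUBCLASS_OF = {"RadiantSensor": "Sensor", "SAR": "RadiantSensor"}
# Class membership of individual objects.
INSTANCE_OF = {"JERS SAR": "SAR"}
# Relational basis: (part, whole) pairs.
PART_OF = {("SensingElement", "Sensor")}
# Functional basis: function name -> {object: value}.
FUNCTIONS = {"bandwidth": {"JERS SAR": "1.275 GHz"}}


def is_a(obj, cls):
    """True if obj is an instance of cls, directly or via the hierarchy."""
    current = INSTANCE_OF.get(obj)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


radiant = is_a("JERS SAR", "RadiantSensor")
bandwidth = FUNCTIONS["bandwidth"]["JERS SAR"]
```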

Once  a  basic  ontology  has  been  defined,  a  language,  which  can  be  unambiguously 
interpreted  by  a  computer,  can  be  used  to  express  knowledge  using  concepts  defined  in 
the  ontology.  OntoSensor  is  a  Semantic  Web-compatible  ontology  initially  developed 
using  Protege  2000  (Stanford,  2004)  and  is  exportable  to  OWL.  OntoSensor  references 
and extends the IEEE Suggested Upper Merged Ontology (SUMO) (Niles & Pease, 2001),
which defines general concepts and associations. OntoSensor was influenced in part by a
preliminary definition of SensorML (Botts et al., 2004), which provides a
framework  consisting  of  a  series  of  UML  class  diagrams  with  associations  realized  in 
XML  for  describing  sensors.  OntoSensor  deviates  from  SensorML  in  that  SensorML 
lacks  the  semantic  richness  provided  by  ontologies,  such  as  axiomatic-grounded  terms, 
which  may  be  required  for  automated  data  fusion  and  inference  in  a  distributed  sensing 
environment. 

OntoSensor  includes  knowledge  models  for  a  variety  of  data  acquisition  boards, 
sensors,  and  processor/radio  units  common  in  commercial  wireless  sensing  environments, 
as  well  as  preliminary  definitions  for  a  variety  of  wired  imaging  sensors,  such  as  those 
provided by Sony and FLIR Systems (FLIR: forward-looking infrared). OntoSensor
contains a taxonomic hierarchy of sensor classes, and the knowledge model for a given
sensor  type  includes  metadata,  such  as  sensitivity  and  other  performance  parameters  for 
the  sensing  elements,  as  well  as  physical  characteristics,  such  as  mass,  radio  frequencies, 
dimensions,  and  power  supply  information  for  wireless  motes.  An  excerpt  of  OntoSensor 
showing  the  taxonomic  hierarchy  from  the  class  FLIR  up  through  the  Sensor  class  is 
shown  in  Figure  3. 


Figure  3.  Excerpt  of  OntoSensor 

3.2 Overview of sensor service-oriented architecture (SOA)

Service-oriented  architecture  (SOA)  provides  the  standard  communication  protocol, 
as  well  as  methodology  to  discover,  invoke,  and  publish  sensor  Web  services.  As  shown 
in  Figure  4,  the  prototype  realization  of  SOA  creates  a  sensor  service  provider  that 
includes the base stations that serve as connectivity points between the sensor service
requestor  and  the  physical  sensors.  The  sensor  service  provider  registers  its  respective 
services  with  a  broker,  aggregates  and  stores  data  from  its  respective  sensor  deployments, 
and  processes  queries  either  by  retrieving  archived  data  or  tasking  the  deployed  sensors 
in  its  network.  Stand-alone  sensors  may  also  serve  as  providers. 


Figure  4.  Generic  model  for  sensor  services 


The  prototype  environment  is  comprised  of  two  computers  that  serve  as  base 
stations  and  store  the  data  collected  from  the  MIB510  and  MIB520  network  gateways 
provided  by  Crossbow  (Crossbow,  2006).  The  base  stations  register  their  respective 
services  with  the  sensor  service  broker.  The  sensor  service  broker  facilitates  semantic 
discovery  of  the  registered  services.  Each  base  station  receives  data  from  a  sensor 
network.  The  wireless  sensors  form  ad-hoc  communication  links  to  route  the  collected 
data back to the base station. The base stations run Crossbow’s MoteView application to
retrieve data from the network gateways, and the data is stored in a PostgreSQL database. In
addition, each base station executes custom software that extracts sensor data and
metadata into OWL repositories that commit to OntoSensor. The software is a
preliminary implementation of a Web service that is invoked at the base stations for
selectively  storing  sensor  data  that  references  OntoSensor  within  the  networked 
environment. 

Successful  reference  of  OntoSensor  requires  a  priori  knowledge  of  the  sensor 
platform  class  associated  with  a  given  node  identification  number.  Later  development 
will  circumvent  this  dependency  through  custom  software  development  on  the  processor 
boards so that they are self-describing. The platform class is used as a primary key to
retrieve  inherited  and  unique  properties  defined  through  constraints  in  OntoSensor  that 
are  exported  to  OWL. 

The base stations also serve as a logical interface for tasking the individual sensors
through  Web  services.  Base  stations  also  provide  physical  storage  aggregation  nodes  for 
sensor  data  and  metadata  that  can  be  used  in  response  to  queries  by  expert  system 
applications  or  other  agents,  rather  than  involving  the  individual  sensor  nodes  in 
responding  directly  to  queries.  Figure  5  depicts  an  abridged  view  of  the  prototype 
environment. 



Figure  5.  Prototype  Environment  (Abridged)  (Russomanno  &  Goodwin,  2007) 


3.3  Sensor  service-oriented  architecture  (SOA) 

The service-oriented architecture (SOA) is composed of three modules: i) sensor
service interface; ii) UDDI; and iii) semantic sensor matchmaker. The sensor service
interface is a preprocessor that can be thought of as a proxy server that routes requests and
invokes matchmaking services when needed. The semantic sensor matchmaker is
invoked by the sensor service interface if the service requestor specifies semantic
matching, and it works in conjunction with UDDI to discover sensor services using
semantics. The semantic matchmaker has the following capabilities when searching for
sensors: i) generalization of sensor types; ii) specialization of sensor types; iii) metadata
processing of specific sensor instances; iv) supported application processing; and v)
performance property processing.

The overall architecture is shown in Figure 6. It leverages the architecture proposed
by Akkiraju et al. (Akkiraju et al., 2007) and allows for maximum inference capabilities
by an expert system, query of instance-specific metadata, and the use of multiple
matchmakers. The adopted approach avoids precluding various inference capabilities,
such as functional properties, whose use is limited in (Luo et al., 2005a; Luo et al., 2005b),
and provides semantic capabilities beyond input/output generalization as in (Paolucci et al.,
2002a; Paolucci et al., 2002b). Furthermore, the architecture allows for domain-specific
matchmakers and the use of external semantic schema, which are required for sensor service
discovery but are not present in some of the proposed architectures (Paolucci et al., 2002a;
Paolucci et al., 2002b; Sivashanmugam et al., 2003; Akkiraju et al., 2003; Srinivasan et
al., 2004; Luo et al., 2005a; Luo et al., 2005b).


Figure  6.  Sensor  discovery  service-oriented  architecture  (Goodwin  et  al.,  2007) 


The  UDDI  sensor  service  interface  facilitates  communication  among  the  agent  or 
expert  system  application,  matchmaker,  and  registry.  The  sensor  service  interface  allows 
for  extension  of  the  UDDI  API  while  keeping  the  standard  API  intact.  If  the  expert 
system  application  or  agent  specifies  semantic  matching,  the  UDDI  registry  is  searched  to 
discover  the  external  ontologies  associated  with  the  registered  services.  The  sensor 
service  interface  then  searches  the  UDDI  registry  to  find  matchmakers  that  provide 
services  within  the  domain  of  a  given  ontology.  If  a  matchmaker  is  discovered,  the 
sensor  service  interface  invokes  the  matchmaker  which  performs  semantic  matching  on 
the  query  in  conjunction  with  the  registered  services  in  the  UDDI  registry. 

A  semantic  sensor  matchmaker,  utilizing  OntoSensor,  aids  in  the  discovery  of 
sensors  and  sensor  Web  services.  The  semantic  sensor  matchmaker  can  be  viewed  as  an 
instance  of  a  matchmaker  Web  service.  Regarding  matchmakers  as  instances  allows  for 
future  addition  of  sensor  matchmakers  that  can  be  specialized  within  a  domain  of  sensor 
types  or  composition  of  sensor  services.  For  example,  sensor  Web  services  can  be 
composed  of  specific  algorithms  that  take  raw  sensor  data  as  input  and  provide  some 
service  on  the  data,  such  as  noise  reduction  or  a  classification  task  on  the  data  that  may  be 
directly  used  by  the  expert  system  application  or  serve  as  input  to  another  service.  This 
framework  allows  for  specialized  matchmakers  that  can  work  collaboratively  to  locate 
the  appropriate  sensors  and  other  algorithms  capable  of  processing  the  data  from  the 
given  sensors,  to  compose  these  services,  and  to  satisfy  a  high-level  goal  of  an  expert 
system  application. 

It  is  a  goal  that  OntoSensor  will  ultimately  contain  the  majority  of  the  semantic 
data  model  required  for  search  for  several  types  of  sensors.  However,  specific  sensor 
instances  may  contain  properties  that  the  matchmaker  must  load  at  query  time.  For 
instance,  the  geo-location,  orientation,  and  lens  type  will  vary  among  the  various 
instances of thermal infrared imaging cameras. An external reference is adopted in the
prototype to provide links to instance-specific data. The describedUsing tModel
allows instances to reference additional semantic schema or instance data by
including  a  pointer  to  the  location  of  the  external  description.  The  external  description  is 
then  loaded  by  the  matchmaker  when  needed. 

A robust matchmaker is required for sensor service discovery, one that goes beyond
input/output matching or simple capability matching. Input/output matching is
crucial in service compositions, where a binding may be missed or mistaken for
compatible due to the lack of semantics. However, sensor Web services require
support for queries on properties such as supported applications, day/night operation, and
instance-specific properties such as lens type, resolution, and geo-location. For example,
suppose the semantic sensor matchmaker receives the following request: “Infrared_Camera
location=xxx.” The matchmaker first searches the UDDI registry to see if there are any
instances of Infrared_Camera. In this scenario, the registry does not contain any such entries.
The matchmaker then finds all subclasses of Infrared_Camera and searches the registry again.
FLIRThermaCam, which is a subclass of Infrared_Camera, is found in the registry. The
matchmaker then obtains the external link to the instance-specific data and determines
that the camera is in location “xxx.” The bindings of the service are then returned to the
service requestor.
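The Infrared_Camera scenario can be sketched in a few lines. This is a toy illustration rather than the prototype's matchmaker: the dictionaries standing in for the ontology, the registry, and the external instance description, as well as the file name and URL, are assumptions. The sketch also applies the location filter at each stage, which simplifies the sequence described above.

```python
# Toy ontology: child class -> parent class.
SUBCLASS_OF = {"FLIRThermaCam": "Infrared_Camera", "Infrared_Camera": "Sensor"}

# Registry entries: registered class name plus a link to instance-specific data.
REGISTRY = [{"name": "FLIRThermaCam", "binding": "http://example.org/cam1",
             "described_using": "cam1.owl"}]

# Stand-in for external instance descriptions the matchmaker would load.
EXTERNAL = {"cam1.owl": {"location": "xxx"}}


def subclasses(cls):
    """All direct and indirect subclasses of cls."""
    found = [c for c, parent in SUBCLASS_OF.items() if parent == cls]
    for child in list(found):
        found.extend(subclasses(child))
    return found


def match(cls, **required):
    """Try the exact class first, then fall back to its specializations."""
    for names in ([cls], subclasses(cls)):
        hits = [e for e in REGISTRY if e["name"] in names
                and all(EXTERNAL[e["described_using"]].get(k) == v
                        for k, v in required.items())]
        if hits:
            return [e["binding"] for e in hits]
    return []


bindings = match("Infrared_Camera", location="xxx")
```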

3.4  UDDI  tModel  as  cache 

This section describes using the UDDI tModel as a form of cache to enhance
syntax-based search. This section follows closely from (Goodwin et al., 2007). A portion of the
UDDI model shown in Figure 6 can be viewed as a cache. If the service requestor’s query
is satisfied directly by the service specifications in the UDDI repository, then the
sensors’ bindings are returned; otherwise, the sensor matchmaker is invoked to find
generalizations or specializations that are semantically similar to the initial query.
Ontological  knowledge  pertinent  to  the  semantically  similar  sensor  is  updated/inserted 
into  the  UDDI  tModels.  The  functionality  is  similar  to  a  conventional  cache  whereby 
subsequent  queries  can  be  satisfied  using  the  registry  without  the  aid  of  the  matchmaker. 
The  sequence  diagram  in  Figure  7  depicts  the  use  of  the  UDDI  repository  as  a  form  of 
cache  in  the  implemented  prototype  to  satisfy  a  query  for  a  sensor  with  the  specified 
capability. 

The  approach  does  not  seek  to  seamlessly  integrate  or  bulk  load  OntoSensor  into  the 
UDDI  tModel,  but  it  provides  a  means  to  enhance  the  syntactic  matching  initially  used 
for  queries.  For  example,  in  Figure  7  the  request  for  a  sensor  service  with  capability  Y 
initially  fails  without  matchmaking  being  specified  in  the  query,  but  it  is  satisfied  using 
semantic  matching  in  conjunction  with  the  knowledge  about  sensors  in  Figure  8.  The 
matchmaker  uses  the  knowledge  to  deduce  that  Sensor  X  is  a  subclass  of  Sensor  Y; 
therefore,  it  also  has  capability  Y.  A  subsequent  analogous  query  succeeds  without 
matchmaking  due  to  the  explicit  assertion  of  these  deduced  facts  in  the  tModel  in  UDDI. 

Figure 7. Sequence diagram for matchmaking search (Goodwin et al., 2007)

B: Sensor X    C: Sensor X

Figure 8. Excerpt of ontological sensor knowledge (Goodwin et al., 2007)

Saving  the  relevant  portion  of  the  semantic  data  model  in  the  UDDI  tModels  after 
matchmaking  using  an  external  ontology  can  be  thought  of  as  a  deep  knowledge 
approach  that  supports  subsequent  shallow  queries  using  a  heuristic  approach.  Consider 
the  following  events  in  which  steps  1-8  correspond  with  Figure  7: 

1. A service provider registers sensor B with capability X.

2.  A  service  requestor  queries  for  a  sensor  with  capability  Y. 

3.  UDDI  is  searched  and  no  hits  are  found  for  the  request. 

4.  A  service  requestor  requests  matchmaking  for  a  sensor  with  capability  Y. 

5.  The  matchmaker  discovers  that  sensor  B  is  an  instance  of  Sensor  X  which  has 
capability  Y. 

6.  UDDI  is  searched  for  sensor  B  and  it  is  found. 

7.  The  bindings  for  sensor  B  are  returned  to  the  requestor. 

8.  Data  for  sensor  B  is  saved  in  the  UDDI  tModel  so  that  it  is  explicit  that  it  has 
capability  Y. 

9.  A  service  provider  registers  sensor  C  with  capability  X. 

10.  A  service  requestor  queries  for  a  sensor  with  capability  Y. 

11.  Sensor  B  is  found  in  UDDI  and  returned  to  the  service  requestor  (sensor  C  is  not 
found). 


In response to the query in step 10, sensor B is found in step 11 through
syntax-based search via the UDDI tModel since the explicit knowledge needed to satisfy the
query was loaded into “cache” from a prior deep query (steps 1-8). In step 11, sensor C is
not  found  in  the  UDDI  repository  by  the  service  requestor’s  initial  syntax-based  query 
(step  10).  On  initial  inspection,  this  may  appear  to  be  an  oversight.  However,  the  returned 
services  can  be  evaluated  by  the  requestor  to  determine  if  they  fulfill  the  needs  of  the 
expert  system  application  or  agent.  If  the  returned  services  are  not  satisfactory,  the  agent 
will query again by specifying semantic matching. For example, a subsequent deeper
probe, as shown by operation N in Figure 9, would discover that sensor C has capability
Y. The process is analogous to using a set of compiled heuristics first (that is, the explicit
facts in the UDDI tModel repository) and then falling back on deeper knowledge (the sensor
ontology) if the heuristics do not provide an acceptable solution.
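The eleven steps and the later deep probe can be traced with a small sketch. It is a simplified model under stated assumptions: the registry is a plain dictionary, the ontology is reduced to an instance-to-class map plus per-class capabilities, and the function names are hypothetical rather than those of the prototype.

```python
# Ontological knowledge held outside the registry: instances B and C belong
# to class Sensor X, and Sensor X has capability Y (a simplification).
INSTANCE_OF = {"B": "SensorX", "C": "SensorX"}
CLASS_CAPABILITIES = {"SensorX": {"Y"}}

registry = {}    # sensor -> set of explicitly asserted capabilities


def register(sensor, capability):
    registry.setdefault(sensor, set()).add(capability)


def syntactic_query(capability):
    """Shallow search: only explicit facts in the registry are consulted."""
    return sorted(s for s, caps in registry.items() if capability in caps)


def semantic_query(capability):
    """Deep search: consult the ontology, then cache what was deduced."""
    hits = [s for s in registry
            if capability in CLASS_CAPABILITIES.get(INSTANCE_OF.get(s), set())]
    for s in hits:
        registry[s].add(capability)    # write the deduced fact back
    return sorted(hits)


register("B", "X")                # step 1
first = syntactic_query("Y")      # steps 2-3: no hits
deep = semantic_query("Y")        # steps 4-8: finds B and caches the fact
register("C", "X")                # step 9
second = syntactic_query("Y")     # steps 10-11: B only, C is not found
deeper = semantic_query("Y")      # operation N: C is discovered as well
final = syntactic_query("Y")
```

After the second deep probe, a purely syntactic query returns both sensors, which is the cache-like behavior described above.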


UDDI Facts

Operation 1 | B has capability X
Operation 8 | B has capability Y
Operation 9 | C has capability X
Operation N | C has capability Y

Figure  9.  UDDI  repository  (Goodwin  et  al.,  2007) 


4.0  Prototype  Implementation 

This section provides a detailed review of the design and implementation of the
sensor service-oriented prototype. This architecture includes the Prolog implementation
of a UDDI-inspired repository (referred to as the Prolog UDDI registry), sensor
matchmaker, and sensor service interface.

4.1  Prolog  UDDI  schema 

The  conceptual  model  developed  for  the  UDDI  prototype,  which  was  implemented 
using  Prolog,  is  shown  in  Figure  10.  This  model  omits  some  of  the  class  definitions  from 
the  full  UDDI-standard  specification  and  focuses  on  the  subset  of  the  model  that  is 
pertinent  for  representing  semantic  sensor  services. 

The  UDDI  conceptual  model  in  Figure  10  is  realized  using  a  UML  to  relational 
database  conversion  strategy  and  implemented  using  Prolog.  The  Prolog  facts 
corresponding  to  a  particular  functor  and  arity  are  analogous  to  tuples  within  a  given 
table, while foreign key attributes represent relations between tables. Example Prolog
functors include businessEntity/9, which represents the sensor service provider, and
businessService/6, which represents and describes the sensor services offered by the
provider.


Figure  10.  Excerpt  of  the  UDDI  conceptual  model  used  in  prototype 

As an example of the implementation details, Figure 11 is the Prolog schema
representing businessService/6. In this schema, Name is the unique key reference to a
class that is described in an ontology. The registered service is an instance of a class in
the ontology defined by the Name attribute. For example, in Figure 12, the Name
attribute has value MEP410_Platform, which is an instance of the MEP410_Platform
class described in OntoSensor. The Name attribute is used when matchmaking is
specified to aid in semantic processing. Services provided by a businessEntity/9 are
retrieved using the foreign key BusinessKey attribute stored in the businessService/6 fact
of a sensor service.


businessService(Name, Description,
                BindingTemplates, CategoryBag,
                ServiceKey, BusinessKey).

Figure 11. Prolog businessService schema

businessService('MEP410_Platform', _,
                [['accessPoint', 'URL']],
                [['describedUsing', 'OntoSensor.owl']],
                '1', '1').

Figure 12. Prolog businessService fact

The  CategoryBag  and  BindingTemplate  attributes  represent  the  categoryBag  and 
bindingTemplate  XML  structures  and  are  composed  of  lists  containing  KeyReference  and 
KeyValue  attributes.  The  CategoryBag  attribute  is  used  to  provide  categorical  or 
descriptive  metadata  for  a  service.  The  BindingTemplate  attribute  is  used  to  provide 
metadata  for  accessing,  tasking,  and  invoking  a  service.  The  KeyReference  attribute 
references a domain tModel and is instantiated using the KeyValue attribute. In Figure
12, the KeyReference attribute has the value describedUsing, which is instantiated with
the KeyValue attribute having value OntoSensor.owl, which indicates the
businessService is described by OntoSensor. The describedUsing tModel is used to link
registered  services  to  external  knowledge  stores.  In  the  prototype  implementation,  the 
KeyReference  with  value  accessPoint  is  used  to  describe  the  URL  for  connecting  with 
the  sensor  service  to  obtain  metadata  or  to  task  the  service. 
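As an illustration of how such KeyReference/KeyValue lists might be read, the sketch below mirrors the businessService/6 argument order in a Python tuple. It is not the prototype's Prolog code: the helper `key_values` is hypothetical, and the two key values "1" are placeholders.

```python
# Argument order follows the businessService/6 schema:
# (Name, Description, BindingTemplates, CategoryBag, ServiceKey, BusinessKey).
service = ("MEP410_Platform", None,
           [["accessPoint", "URL"]],
           [["describedUsing", "OntoSensor.owl"]],
           "1", "1")    # placeholder service and business keys


def key_values(pairs, key_reference):
    """Return every KeyValue whose KeyReference matches key_reference."""
    return [value for ref, value in pairs if ref == key_reference]


name, _description, binding_templates, category_bag, _skey, _bkey = service
ontologies = key_values(category_bag, "describedUsing")
access_points = key_values(binding_templates, "accessPoint")
```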

4.2  Mapping  semantic  data  models  to  UDDI  data  structures 

In the UDDI specification, the tModel XML structure is an independent reference
table allowing for domain definition and instantiation using the keyedReference structure
of the categoryBag (Clement et al., 2004). The categoryBag schema, shown in Figure 13,
is composed of references that point to existing tModels via the tModelKey attribute.
The tModelKey attribute points to the domain that the KeyValue attribute instantiates.
The UDDI repository uses the categoryBag structure to represent shallow taxonomic
properties, and the structure is used here to store semantic data in the UDDI tModels.
Existing approaches use tModels and the categoryBag structure of the tModels to bulk load
the entire ontology into a UDDI database (Luo et al., 2005a; Luo et al., 2005b). The
approach taken in the implementation of the prototype seeks to enhance subsequent
syntactic search after matchmaking by saving metadata in the UDDI tModels and not
explicitly loading an entire ontology into the UDDI repository.


Figure  13.  categoryBag  schema 

TModels are used to describe properties such as inheritance relations,
applications, and capabilities. For instance, subClassOf, describedUsing, and
hasCapability  relations  can  be  represented  using  tModels  with  the  instantiation  being 
stored  in  the  CategoryBag  attribute  of  a  businessService  fact.  Figure  14  contains  an 
excerpt  of  a  taxonomic  relationship  in  OntoSensor  and  Figure  15  contains  the 
corresponding  OWL  export.  An  instance  of  a  ThermacamSC500  could  be  stored  as  a 
sensor  service  in  the  UDDI  registry  and  is  described  initially  using  the  Prolog  fact 
businessService(‘ThermacamSC500’,  _,  _,  _,  ‘AserviceKey’,  ‘AbusinessKey’).  The  fact 
contains  no  knowledge  corresponding  with  the  taxonomic  hierarchy  in  Figure  14. 
Syntax-based  queries  for  sensor  service  instances  that  are  members  of  a  specialization  of 
PhotoConductive  or  MidWave  would  fail  to  find  instances  of  ThemacamSC500  and 
would  require  semantic  matching  to  locate  such  sensors.  Storing  the  taxonomic 
relationships  represented  in  Figure  14  in  UDDI  is  feasible  using  the  service’s 
categoryBag  schema  and  will  enhance  syntax-based  search. 


Figure 14. OntoSensor excerpt

<owl:Class rdf:ID="ThermacamSC500">
    <rdfs:subClassOf>
        <owl:Class rdf:ID="MidWaveIR"/>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
        <owl:Class rdf:ID="PhotoConductive"/>
    </rdfs:subClassOf>
</owl:Class>

Figure 15. OntoSensor OWL excerpt

Mapping the knowledge in Figure 15 into the UDDI repository requires the
creation of subClassOf and describedUsing tModels. Shown in Figures 16 and 18 are the
UDDI XML representations of the describedUsing and subClassOf tModels. Shown in
Figures 17 and 19 are the Prolog facts representing the describedUsing and subClassOf
tModels implemented in the prototype.

The taxonomic hierarchy shown in Figure 14 can be realized using the subClassOf
tModel along with the describedUsing tModel to provide a link to the external ontology.
Shown in Figure 20 is the UDDI XML representation of Figure 14, while Figure 21 is the
corresponding Prolog fact implemented in the prototype. The CategoryBag attribute of a
sensor service, for example, a deployed instance of the ThermacamSC500 sensor,
references the tModels subClassOf and instanceOf to capture the taxonomic relationships
in Figure 14. Relationships, such as hasCapability, are saved using similar tModel
structures and corresponding Prolog implementations of the tModels.




<tModel tModelKey="uddi:describedUsing">
    <name>describedUsing</name>
</tModel>

Figure 16. UDDI describedUsing tModel

tModel('describedUsing', _, _, _,
       [['describedUsing', 'URL:OntoSensor.owl']],
       uddi:describedUsing, _, _).

Figure 17. Prolog describedUsing tModel fact
<tModel tModelKey="uddi:subClassOf">
    <name>subClassOf</name>
    <categoryBag>
        <keyedReference tModelKey="uddi:describedUsing"
                        keyValue="URL:OntoSensor.owl"/>
    </categoryBag>
</tModel>

Figure 18. UDDI subClassOf tModel

tModel('subClassOf', _, _, _,
       [['describedUsing', 'URL:OntoSensor.owl']],
       uddi:subClassOf, _, _).

Figure 19. Prolog subClassOf tModel fact

<businessService>
    <name>ThermacamSC500</name>
    <categoryBag>
        <keyedReference tModelKey="uddi:describedUsing"
                        keyValue="URL:OntoSensor.owl"/>
        <keyedReference tModelKey="uddi:subClassOf"
                        keyValue="PhotoConductive"/>
        <keyedReference tModelKey="uddi:subClassOf"
                        keyValue="MidWaveIR"/>
    </categoryBag>
</businessService>

Figure 20. UDDI businessService tModel

businessService('ThermacamSC500', _, _,
                [['describedUsing', 'URL:OntoSensor.owl'],
                 ['subClassOf', 'PhotoConductive'],
                 ['subClassOf', 'MidWaveIR']],
                'AserviceKey', 'AbusinessKey').

Figure 21. Prolog businessService tModel fact

4.3  Prolog  UDDI  API 

The Prolog realization of the UDDI model implemented in the prototype omits some
of the capabilities of the UDDI API. The search capabilities retained are find_service
and find_tModel. The Prolog implementation of find_service has the following
arguments: Name, Description, BusinessTemplate, CategoryBag, ServiceKey,
BusinessKey, ServiceList, and Operator. The Operator argument can take the values OR
and AND, which apply the corresponding logical operation to the search parameters. If
Operator is NULL, the default behavior is logical AND. If the CategoryBag argument is
not NULL, the find_tModel/2 predicate is invoked to find syntactic matches among the
registered services' categoryBags.
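
The search behavior described above can be illustrated with a minimal sketch. It is written in Python rather than the prototype's Prolog, and it models only the Name, CategoryBag, and Operator arguments. The `BusinessService` class, the `find_service` signature, and the `OtherSensor` entry are assumptions made for illustration, not the prototype's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BusinessService:
    """Simplified stand-in for the businessService fact of Figure 21."""
    name: str
    category_bag: List[List[str]] = field(default_factory=list)
    service_key: Optional[str] = None
    business_key: Optional[str] = None


def find_service(services, name=None, category_bag=None, operator=None):
    """Return the services that match the supplied (non-None) criteria.

    operator is "AND", "OR", or None; None falls back to logical AND.
    """
    combine = any if operator == "OR" else all
    matches = []
    for service in services:
        checks = []
        if name is not None:
            checks.append(service.name == name)
        if category_bag is not None:
            # Syntactic match: every requested pair must appear in the bag.
            checks.append(all(ref in service.category_bag for ref in category_bag))
        if not checks or combine(checks):
            matches.append(service)
    return matches


REGISTRY = [
    BusinessService(
        "ThermacamSC500",
        [["describedUsing", "URL:OntoSensor.owl"],
         ["subClassOf", "PhotoConductive"],
         ["subClassOf", "MidWaveIR"]],
        "AserviceKey",
        "AbusinessKey",
    ),
    # Hypothetical second service, present only to show a non-match.
    BusinessService("OtherSensor", [["describedUsing", "URL:OntoSensor.owl"]]),
]

mid_wave = find_service(REGISTRY, category_bag=[["subClassOf", "MidWaveIR"]])
print([service.name for service in mid_wave])
```

Because the taxonomic pairs are stored in the service's category bag, a purely syntactic lookup for `subClassOf MidWaveIR` finds the ThermacamSC500 service without consulting the ontology.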

4.4  Sensor  service  interface 
The sensor service interface intercepts service requests to the UDDI repository and
invokes matchmaking services, if specified by the requestor. If the requestor does not
specify semantic matchmaking, the service request is passed to the UDDI repository.
Shown in Figure 22 is a state-chart diagram depicting the process of the sensor service
interface attempting to satisfy a find_service query.


[State-chart diagram. States: START, State 1, State 2, END. Labels: Operator := SemanticMatching;
Operator != SemanticMatching, Uddi:find_service(); Find domain ontologies referenced by services;
Determine if there are registered matchmakers for the ontology domain; Semantic matchmaker not
available.]

Figure 22. Sensor service interface state-chart diagram


The Operator argument of find_service permits the values AND, OR, and
SemanticMatch. If the argument does not have the value SemanticMatch, the service
request is passed to the UDDI repository. Otherwise, the sensor service interface
searches the UDDI repository to find the ontologies referenced by the services. The
interface then searches the UDDI registry for matchmakers that perform matchmaking
services within the domain of the referenced ontology. If a matchmaker is discovered, it
is invoked using the API invocation command described in the service description.
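
The dispatch logic in this paragraph can be sketched as follows. This is an illustrative Python outline, not the prototype's Prolog. The function name `handle_find_service`, its callback parameters, and the stub functions are all hypothetical, and returning an empty list when no matchmaker is registered is an assumption about the "matchmaker not available" path.

```python
def handle_find_service(query, operator, uddi_search, referenced_ontologies, matchmakers_for):
    """Route a find_service request through the sensor service interface."""
    if operator != "SemanticMatch":
        # Syntactic request: pass it straight to the UDDI repository.
        return uddi_search(query, operator)
    results = []
    # Semantic request: find the ontologies the services reference, then
    # invoke every matchmaker registered for each ontology domain.
    for ontology in referenced_ontologies():
        for matchmaker in matchmakers_for(ontology):
            for service in matchmaker(query):
                if service not in results:
                    results.append(service)
    return results  # empty when no semantic matchmaker is available


# Hypothetical stubs standing in for the UDDI repository and a matchmaker.
def stub_uddi_search(query, operator):
    return ["syntactic:" + query]


def stub_referenced_ontologies():
    return ["URL:OntoSensor.owl"]


def stub_matchmakers_for(ontology):
    if ontology == "URL:OntoSensor.owl":
        return [lambda query: ["ThermacamSC500"]]
    return []


print(handle_find_service("MidWaveIR", "SemanticMatch", stub_uddi_search,
                          stub_referenced_ontologies, stub_matchmakers_for))
```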

4.5  Semantic  sensor  matchmaker 

The  semantic  sensor  matchmaker  is  capable  of  performing  generalization, 
specialization,  performance  property,  supported  applications,  and  instance  semantic 
processing.  The  query  command  lists  that  control  the  behavior  of  the  sensor  matchmaker 
along  with  the  corresponding  predicates  invoked  are  shown  in  Table  1.  The  valid 
operations  that  can  be  performed  on  query  command  list  results  are  shown  in  Table  2. 
Table 1. API query definitions

API Query Command                     Description

['specializationOf', CLASS]           Invokes the find_specialization/2 predicate. Finds
                                      specializations of the argument CLASS.

['generalizationOf', CLASS]           Invokes the find_generalization/2 predicate. Finds
                                      generalizations of the argument CLASS.

['supportedApplication', Capability]  Invokes the application_query/2 predicate. Finds
                                      sensor services that have the supported application
                                      Capability.

[Object, Value]                       Invokes the find_performance/2 predicate. Finds sensor
                                      services that have performance property Object and
                                      Value. If the performance_query/2 predicate fails,
                                      the instance_query/2 predicate is invoked to
                                      attempt to satisfy the query. The instance_query/2
                                      predicate loads instance metadata for the services
                                      to satisfy the query.

Table 2. Operations on query lists

List Operation                  Description

['OR', QueryList]               Performs logical OR on the QueryList result.

['AND', QueryList]              Performs logical AND on the QueryList result.

[OP1, OP2, OP3, List1, List2]   Performs OP1(OP3(List2), OP2(List1)).
                                OP1, OP2, and OP3 perform logical OR or
                                AND operations.
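
The list operations of Table 2 can be illustrated with a short Python sketch. It assumes that each query command yields a set of matching service names, that logical OR corresponds to set union, and that logical AND corresponds to set intersection. These assumptions, the function names, and the service names `s1` to `s4` are illustrative and are not taken from the prototype.

```python
def combine(op, result_sets):
    """Fold a list of query results with logical OR (union) or AND (intersection)."""
    sets = [set(result) for result in result_sets]
    if not sets:
        return set()
    if op == "OR":
        return set().union(*sets)
    if op == "AND":
        return set.intersection(*sets)
    raise ValueError("operator must be 'OR' or 'AND'")


def combine_nested(op1, op2, op3, list1, list2):
    """Evaluate OP1(OP3(List2), OP2(List1)), the third row of Table 2."""
    return combine(op1, [combine(op3, list2), combine(op2, list1)])


# Hypothetical results of two query command lists.
LIST1 = [{"s1", "s2"}, {"s2", "s3"}]
LIST2 = [{"s2"}, {"s4"}]

print(sorted(combine("AND", LIST1)))
print(sorted(combine_nested("AND", "AND", "OR", LIST1, LIST2)))
```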

The  SWI-Prolog  Semantic  Web  library  (Wielemaker  et  al.,  2003)  is  used  in  the 
prototype  to  load  and  query  external  ontologies.  The  rdf_db  module  is  used  to  assert 
OntoSensor into the Prolog knowledge base as RDF triples. The rdf_load/1 predicate is
used  to  load  the  OntoSensor  ontology  into  rdf/3  facts  with  arguments  that  can  be  viewed 
as  Subject,  Verb,  and  Object.  The  RDF  triples  are  queried  using  Prolog  predicates. 
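
As a rough illustration of querying triples by pattern, the following Python sketch mimics the way an rdf/3 goal with unbound arguments behaves. It is not the SWI-Prolog library: the `rdf` function here is a hypothetical stand-in in which `None` plays the role of an unbound variable, and the two triples are the subclass relations of Figure 15.

```python
# Subclass relations from Figure 15, written as (Subject, Verb, Object).
TRIPLES = [
    ("ThermacamSC500", "rdfs:subClassOf", "MidWaveIR"),
    ("ThermacamSC500", "rdfs:subClassOf", "PhotoConductive"),
]


def rdf(triples, subject=None, verb=None, obj=None):
    """Return the triples matching the pattern; None matches anything."""
    return [
        triple for triple in triples
        if (subject is None or triple[0] == subject)
        and (verb is None or triple[1] == verb)
        and (obj is None or triple[2] == obj)
    ]


# All direct superclasses recorded for ThermacamSC500.
for _, _, parent in rdf(TRIPLES, subject="ThermacamSC500", verb="rdfs:subClassOf"):
    print(parent)
```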

The  matchmaker  retrieves  the  services  in  the  UDDI  registry  that  subscribe  to 
OntoSensor. The UDDI find_service/7 predicate is used to narrow the search space for
the  matchmaker  as  well  as  to  minimize  the  number  of  queries  issued  to  the  UDDI  model. 
Locating  the  services  that  subscribe  to  the  domain  ontology  OntoSensor  requires  only  one 
call  to  the  UDDI  registry. 

The  semantic  processing  continues  until  the  list  of  query  commands  is  exhausted. 
Once  the  list  is  empty,  the  operations  in  Table  2  are  performed,  if  specified.  The  RDF 
triples  are  retracted  from  memory  and  the  result  of  the  semantic  processing  is  then 
returned  to  the  sensor  service  interface.  Retracting  the  RDF  triples  from  the 
matchmaker's knowledge base introduces computational overhead. Because the ontology used
by the matchmaker can be updated, the RDF triples are reloaded into Prolog's
memory for each request. A paging methodology or expiration window for reloading the
ontology  could  circumvent  the  computational  overhead,  but  that  process  is  beyond  the 
scope  of  the  initial  prototype. 
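
One way the expiration window mentioned above might work is sketched below in Python. This is a possible design, not part of the prototype: the `OntologyCache` class, its parameters, and the injected clock are all assumptions. The idea is to reload the triples only when the cached copy is older than a chosen window, accepting that ontology updates may be seen late by up to that window.

```python
import time


class OntologyCache:
    """Keep loaded triples in memory and reload them only after they expire."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader            # callable that returns the triples
        self._max_age = max_age_seconds  # length of the expiration window
        self._clock = clock              # injectable so the window can be tested
        self._triples = None
        self._loaded_at = None

    def triples(self):
        now = self._clock()
        expired = self._triples is None or now - self._loaded_at >= self._max_age
        if expired:
            self._triples = self._loader()
            self._loaded_at = now
        return self._triples


def load_ontosensor():
    """Hypothetical loader; a real one would parse the ontology file."""
    return [("ThermacamSC500", "rdfs:subClassOf", "MidWaveIR")]


cache = OntologyCache(load_ontosensor, max_age_seconds=300)
print(len(cache.triples()))
```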

4.5.1  Specialization  query 
The specialization processing of a query for sensor services satisfying a given sensor
type constraint is performed by invoking the Prolog predicate
is_platform_specialization/3 developed for the prototype. A unique list of deployed
platforms (note that a platform may contain several sensor types) is created from the
UDDI list of registered sensor services and is one argument of the
is_platform_specialization/3 predicate. The predicate uses depth-first search to determine
if  one  of  the  platform’s  sensors  is  a  specialization  of  the  sensor  type  specified  in  the 
query.  In  the  prototype,  the  argument,  which  is  the  sensor  type  to  be  specialized  in  the 
original  query,  must  be  a  class  in  OntoSensor.  This  argument  is  used  as  the  root  node  to 
produce  a  sensor-type  taxonomic  hierarchy  that  will  serve  as  a  search  tree.  The  subtree 
produced  from  the  root  node  is  searched  to  determine  if  a  deployed  sensor  is  an  instance 
of  a  class  within  the  subtree.  If  a  class  is  found  in  which  a  registered  sensing  component 
is  an  instance,  the  corresponding  sensor  service  is  accumulated  in  a  list  and  the  search 
resumes  until  the  registered  service  list  is  empty.  If  matches  are  found,  the  portion  of  the 
semantic  data  model  pertinent  to  satisfying  the  query  is  stored  in  the  UDDI  tModels  via 
the  methodology  discussed  in  Section  4.2. 
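
The specialization search can be sketched in Python as a depth-first traversal of the subclass hierarchy. This is an illustration under stated assumptions rather than the is_platform_specialization/3 predicate itself. The function names, the (child, parent) pair representation, the platform names, and the `AcousticArray` class are hypothetical, while the two ThermacamSC500 relations come from Figure 15.

```python
def subtree_of(root, subclass_pairs):
    """Collect root and all of its transitive subclasses, depth first."""
    children = {}
    for child, parent in subclass_pairs:
        children.setdefault(parent, []).append(child)
    seen, stack = set(), [root]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        stack.extend(children.get(cls, []))
    return seen


def find_specializations(sensor_type, platforms, subclass_pairs):
    """Return platforms carrying a sensor whose class lies in the subtree."""
    subtree = subtree_of(sensor_type, subclass_pairs)
    return [name for name, sensor_classes in platforms
            if any(cls in subtree for cls in sensor_classes)]


SUBCLASS_PAIRS = [
    ("ThermacamSC500", "MidWaveIR"),
    ("ThermacamSC500", "PhotoConductive"),
]
# Hypothetical deployed platforms, each listing the classes of its sensors.
PLATFORMS = [
    ("platformA", ["ThermacamSC500"]),
    ("platformB", ["AcousticArray"]),
]

print(find_specializations("MidWaveIR", PLATFORMS, SUBCLASS_PAIRS))
```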

4.5.2  Generalization  query 

The  generalization  processing  implemented  in  the  prototype  discovers  registered 
sensor  services  that  are  direct  members  of  a  generalization  of  the  specified  class  in  the 
query  or  sensor  services  that  are  members  of  a  specialization  of  one  or  more  of  the 
generalized  classes.  The  generalization  processing  is  performed  by  the 
find_generalization/3 predicate, which works in two steps using the predicates
find_generalization/3 and is_platform_specialization/3. The predicate
find_generalization/3 searches for the classes that are a generalization of the sensor type
specified  in  the  query,  which  must  be  a  class  specified  in  OntoSensor.  The  predicate 
is_platform_specialization/3  is  invoked  to  find  specializations  of  the  generalized  classes. 

Determining  the  appropriate  degree  of  generality  for  a  query  is  beyond  the  scope 
of  the  current  implementation  of  the  prototype.  Possible  future  approaches  are  to 
quantify  the  degree  of  generality  needed  to  satisfy  a  particular  query  or  to  iteratively 
generalize  the  query  until  matches  are  found  and  there  are  no  additional  parent  classes. 
The  implementation  in  the  prototype  uses  a  very  liberal  definition  of  generalization.  For 
example,  the  OntoSensor  class  Sensor  is  the  most  general  sensor  type,  that  is,  it  is  the 
parent  node  for  all  sensor  classes.  The  maximum  extents  of  the  generalization  for  a 
sensor  type  specified  in  a  query  are  the  direct  subclasses  of  Sensor.  These  generalized 
classes  serve  as  root  nodes  to  generate  subtrees  consisting  of  sensor  type  subclasses.  In 
the  initial  prototype  implementation,  the  registered  sensor  platforms  having  components 
that  are  instances  in  this  search  space  are  considered  instances  of  generalizations  of  the 
sensor  type  specified  in  the  original  query. 
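
The liberal generalization described above can be sketched in two steps: climb from the queried class to its ancestors that are direct subclasses of Sensor, then treat every class below those ancestors as the search space. The Python below is an illustrative sketch, not the prototype's predicates. All function names and the taxonomy are hypothetical, apart from the class names Sensor, MidWaveIR, and ThermacamSC500.

```python
def generalization_roots(sensor_type, subclass_pairs, top="Sensor"):
    """Ancestors of sensor_type (or the type itself) directly below top."""
    parents = {}
    for child, parent in subclass_pairs:
        parents.setdefault(child, set()).add(parent)
    roots, seen, stack = set(), set(), [sensor_type]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        for parent in parents.get(cls, ()):
            if parent == top:
                roots.add(cls)        # reached the maximum extent
            else:
                stack.append(parent)  # keep climbing
    return roots


def descendants(roots, subclass_pairs):
    """All classes in the subtrees rooted at roots, roots included."""
    children = {}
    for child, parent in subclass_pairs:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), list(roots)
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(children.get(cls, ()))
    return seen


def find_generalizations(sensor_type, platforms, subclass_pairs):
    roots = generalization_roots(sensor_type, subclass_pairs)
    search_space = descendants(roots, subclass_pairs)
    return [name for name, classes in platforms
            if any(cls in search_space for cls in classes)]


# Hypothetical taxonomy; only Sensor, MidWaveIR, and ThermacamSC500 are from the text.
PAIRS = [
    ("Infrared", "Sensor"), ("MidWaveIR", "Infrared"), ("LongWaveIR", "Infrared"),
    ("ThermacamSC500", "MidWaveIR"), ("Acoustic", "Sensor"), ("Microphone", "Acoustic"),
]
PLATFORMS = [("platformA", ["ThermacamSC500"]), ("platformB", ["LongWaveIR"]),
             ("platformC", ["Microphone"])]

print(find_generalizations("MidWaveIR", PLATFORMS, PAIRS))
```

In this example a query for MidWaveIR generalizes to the hypothetical Infrared class, so a platform carrying a LongWaveIR sensor is also returned, which shows how broad the liberal definition is.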

4.5.3  Supported  application  and  performance  property  query 

Figure  23  is  an  excerpt  of  a  class  diagram  used  in  OntoSensor.  A  sensor’s 
capabilities  have  been  captured  through  the  class  CapabilitiesDescription.  This  class  in 
turn  is  linked  to  the  GenericProperty  class  through  two  associations.  A  specific  sensor’s 
sensitivity  and  resolution  can  be  derived  through  query  of  the  performanceProperty 
association.  For  example,  noise  equivalent  temperature  difference  (NETD),  which  is  one 
gross measure of sensitivity, can be determined by obtaining the values of certain sensor
parameters, such as focal length, horizontal and vertical field of view, frame rate, and
overscan ratio. Through the supportedApplications association, queries can determine the
types of sensors to attempt to locate and task based on application criteria such as GPS
capability, night operation capability, foliage penetration, and all-weather capability.

[UML class diagram. Sensor is linked to CapabilitiesDescription through hasCapabilities;
CapabilitiesDescription is linked to GenericProperty through the performanceProperty and
supportedApplications associations.]

Figure 23. Capabilities description UML (Russomanno & Goodwin, 2007)


Two  separate  predicates  are  used  for  finding  supported  applications  and 
performance  processing  in  the  prototype,  but  the  process  of  satisfying  the  queries  is 
similar.  The  CapabilitiesDescription  facts  are  queried  to  determine  if  any  of  the 
registered  sensors  satisfy  either  the  supported  application  or  performance  property 
constraint.  The  query  processing  continues  until  the  unique  list  of  registered  services  is 
exhausted.  The  semantic  metadata  used  to  satisfy  the  query  is  explicitly  stored  in  the 
UDDI  businessService  tModel  descriptions  of  the  registered  sensor  services  that  satisfied 
the  query. 

Instance metadata for registered services is loaded if the query cannot be satisfied
through the matchmaker's performance property knowledge gained from OntoSensor.
The external metadata is loaded into the matchmaker's knowledge base using the
bindingTemplate  of  the  businessService  description,  which  contains  the  URI  for 
obtaining  the  metadata.  The  metadata  is  loaded  into  the  knowledge  base  and  the 
matchmaker  attempts  to  use  the  additional  knowledge  to  satisfy  the  query.  If  a  semantic 
match  is  found,  the  semantic  data  pertinent  for  satisfying  the  query  is  saved  in  the  UDDI 
tModels  via  the  methodology  discussed  in  Section  4.2. 
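
The fallback from ontology knowledge to instance metadata can be sketched as follows. The Python below is an illustration only: the function `property_query`, the dictionary layout, the `ExampleCamera` class, the example.org URIs, and every property value are placeholders invented for the sketch and do not describe any real sensor or the prototype's data.

```python
def property_query(prop, value, services, class_properties, load_metadata):
    """Find services with the given property value, loading instance
    metadata only when ontology-level knowledge yields no match."""
    matches = [
        service["name"] for service in services
        if class_properties.get(service["sensor_class"], {}).get(prop) == value
    ]
    if matches:
        return matches
    # Fallback: fetch per-instance metadata from the URI carried by each
    # service's bindingTemplate and retry the query against it.
    return [
        service["name"] for service in services
        if load_metadata(service["binding_uri"]).get(prop) == value
    ]


# Placeholder data, invented for illustration.
SERVICES = [
    {"name": "serviceA", "sensor_class": "ExampleCamera",
     "binding_uri": "http://example.org/serviceA"},
    {"name": "serviceB", "sensor_class": "ExampleCamera",
     "binding_uri": "http://example.org/serviceB"},
]
CLASS_PROPERTIES = {"ExampleCamera": {"frameRate": 60}}
INSTANCE_METADATA = {"http://example.org/serviceB": {"focalLength": 50}}


def load_metadata(uri):
    return INSTANCE_METADATA.get(uri, {})


print(property_query("focalLength", 50, SERVICES, CLASS_PROPERTIES, load_metadata))
```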


5.0  Conclusions 

This  paper  has  described  an  architecture  and  prototype  implementation  that 
integrates  a  sensor  ontology  within  a  service-oriented  architecture  (SOA)  to  support  the 
development of expert system applications. The prototype environment supports the
query of sensor services that are registered in a UDDI-compliant repository
augmented with semantics. It supports queries using generalization and specialization
via reference to an external ontology, as well as queries of sensor performance properties,
supported applications, and metadata about deployed sensor instances. The prototype
was  implemented  utilizing  Semantic  Web  infrastructure  to  overcome  some  of  the 
limitations  of  the  current  service-oriented  approaches  that  rely  solely  on  XML-data 
structures  and  syntax-based  search  mechanisms  for  discovery  of  Web  services. 
Areas of future research and development include integrating the prototype
environment with a geographical information system containing both land-base
references and inside-plant or building graphics to aid in the discovery and query
of sensors deployed in a variety of locations. The composition of sensor services with
other Web services, including algorithms that take sensor data as input, perform an
operation, and return a classification or output for processing by subsequent services, is
also of high priority.
8.2 Appendix B

Develop  Image  Quality  Metrics  for  Fused  Imagery 


Figure  6:  MWIR  image 


Figure  7:  Color  Fused  image  to  show  registration 


Figure  8:  Pixel- averaged  (superposition)  image 
Figure 10: Multiscale image fusion


Figure  11:  Laplacian  A  image  fusion 


Figure  12:  Contrast  Ratio  A  image  fusion 


Figure  13:  Discrete  wavelet  (DWT)  image  fusion 
Figure 14: Shift invariant DWT image fusion


Figure  15:  Gabor  image  fusion 


Figure  16:  Principal  component  analysis  image  fusion 


Figure  17:  Pixel  averaging  image  fusion 